
Make RottenLinks more effective at status detection
Closed, Resolved (Public)

Description

Currently we simply make an HTTP request and check only one part of the response. We can probably shrink the request to cut bandwidth usage, and then introduce more effective and advanced checking of the response.
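
One possible shape for the "more effective and advanced checking" mentioned above is to map each response code to a coarse link-health label instead of looking at a single value. This is only a minimal sketch: the function name classifyHttpStatus, the labels, and the convention that 0 means "no response" are all assumptions made for illustration, not part of RottenLinks.

```php
<?php
/**
 * Hypothetical helper: map an HTTP status code to a coarse link-health label.
 * A code of 0 is assumed here to mean "no response" (timeout, DNS failure).
 */
function classifyHttpStatus( int $code ): string {
	if ( $code === 0 ) {
		return 'no-response';
	}
	if ( $code >= 200 && $code < 300 ) {
		return 'ok';
	}
	if ( $code >= 300 && $code < 400 ) {
		return 'redirect';
	}
	if ( $code === 404 || $code === 410 ) {
		// Not Found and Gone are the clearest signs of a rotten link.
		return 'dead';
	}
	if ( $code === 405 || $code === 501 ) {
		// The server rejects the method; a HEAD check may need a GET retry.
		return 'retry-with-get';
	}
	if ( $code >= 400 && $code < 500 ) {
		return 'client-error';
	}
	if ( $code >= 500 && $code < 600 ) {
		return 'server-error';
	}
	return 'unknown';
}
```

Separating 'dead' from 'server-error' and 'retry-with-get' would let a checker report only links that are very likely gone, and recheck the rest later.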

Event Timeline

John triaged this task as Low priority. Jun 6 2020, 11:20
John created this task.
Unknown Object (User) moved this task from Backlog to Features on the RottenLinks board. Jan 10 2021, 08:05
Unknown Object (User) moved this task from Features to Maintenance on the RottenLinks board. Feb 5 2021, 18:54
Unknown Object (User) added a project: Universal Omega. Mar 23 2021, 01:01
Unknown Object (User) moved this task from Unsorted to Goals on the Universal Omega board.
Unknown Object (User) moved this task from Maintenance to Features on the RottenLinks board. Mar 24 2021, 02:06
Unknown Object (User) moved this task from Goals to Long Term on the Universal Omega board. Mar 24 2021, 22:06
Unknown Object (User) removed a project: Universal Omega. Apr 3 2021, 06:54
Unknown Object (User) unsubscribed. Apr 3 2021, 19:58

Since T7297 is considered to be a duplicate:

MediaWiki offers the HttpRequestFactory class to make HTTP calls in a standardised manner. The class ensures that MediaWiki's internal logging features (e.g. the 'http' log channel) and configuration settings (e.g. http_proxy) are applied when HTTP calls are executed. RottenLinks, however, currently uses the curl_ functions directly.
Example code (untested!):

$request = MediaWikiServices::getInstance()->getHttpRequestFactory()->create(
	$url,
	[
		'method' => 'HEAD', // return headers only
		'timeout' => $config->get( 'RottenLinksCurlTimeout' ),
		'userAgent' => 'RottenLinks, MediaWiki extension (https://github.com/miraheze/RottenLinks), running on ' . $config->get( 'Server' )
	],
	__METHOD__
)->execute();
return (int)$request->getStatus();
Unknown Object (User) added a comment. Edited May 13 2021, 19:19

Just to mention, I am about to open a PR for this, but the final return here is not entirely correct. It should be return (int)$request->getStatusValue()->getValue(); instead. ->execute() returns a Status instance, which doesn't have getStatus(), so we call getStatusValue() to get a StatusValue instance and then getValue() to get the HTTP response code from it.

Sorry, I messed up my code after rewriting it. You should not chain ->getStatus() after ->execute(). See this example:

$request = MediaWikiServices::getInstance()->getHttpRequestFactory()->create(
	$url,
	[
		'method' => 'HEAD', // return headers only
		'timeout' => $config->get( 'RottenLinksCurlTimeout' ),
		'userAgent' => 'RottenLinks, MediaWiki extension (https://github.com/miraheze/RottenLinks), running on ' . $config->get( 'Server' )
	],
	__METHOD__
);
// execute() returns a Status object, so keep the request object separate
$reqexec = $request->execute();
// read the HTTP response code from the request, not from the Status
return (int)$request->getStatus();
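
To try the HEAD approach outside a MediaWiki installation, the same idea can be sketched with plain PHP streams. This is not a replacement for HttpRequestFactory and is not what RottenLinks does. The helper names parseStatusLine and headStatus, the 30-second default timeout and the user agent string are all made up for illustration.

```php
<?php
/**
 * Hypothetical helper: extract the numeric status code from an HTTP status
 * line such as "HTTP/1.1 404 Not Found". Returns 0 if the line is not a
 * status line.
 */
function parseStatusLine( string $line ): int {
	if ( preg_match( '#^HTTP/\d+(?:\.\d+)?\s+(\d{3})\b#', $line, $m ) === 1 ) {
		return (int)$m[1];
	}
	return 0;
}

/**
 * Hypothetical helper: send a HEAD request with plain PHP streams and return
 * the final status code, or 0 when no response was received.
 */
function headStatus( string $url, float $timeout = 30.0 ): int {
	$context = stream_context_create( [
		'http' => [
			'method' => 'HEAD', // headers only, no response body
			'timeout' => $timeout,
			'ignore_errors' => true, // still collect headers on 4xx/5xx
			'user_agent' => 'ExampleLinkChecker/0.1', // placeholder value
		],
	] );
	$headers = @get_headers( $url, false, $context );
	if ( $headers === false ) {
		return 0;
	}
	// When redirects are followed, several status lines appear; keep the last.
	$code = 0;
	foreach ( $headers as $header ) {
		$parsed = parseStatusLine( $header );
		if ( $parsed !== 0 ) {
			$code = $parsed;
		}
	}
	return $code;
}
```

Unlike the HttpRequestFactory example, this sketch bypasses MediaWiki's logging and proxy configuration, which is the very drawback raised in this task, so it is only useful for quick experiments.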
Unknown Object (User) added a comment. May 13 2021, 20:54

I merged https://github.com/miraheze/RottenLinks/pull/32. Not sure what else needs to be done for this task (if anything), so leaving it open for now.

Unknown Object (User) closed this task as Resolved. Jun 15 2021, 04:34
Unknown Object (User) claimed this task.
Unknown Object (User) added a project: Universal Omega.

Closing as resolved, on the assumption that doing what Southparkfan mentioned completes this task. If that is wrong, please reopen it. Thank you!

Unknown Object (User) moved this task from Long Term to Goals on the MediaWiki (SRE) board. Jun 15 2021, 04:34
Unknown Object (User) moved this task from Goals to Long Term on the MediaWiki (SRE) board.

Sorry.