Make RottenLinks more effective at status detection
Closed, Resolved (Public)

Description

Currently we simply make an HTTP request and then check one part of the response. We can probably reduce the size of the request and the bandwidth it uses, and then introduce more thorough checking of what comes back.

Event Timeline

John triaged this task as Low priority. Jun 6 2020, 11:20
John created this task.

Since T7297 is considered to be a duplicate:

MediaWiki offers the HttpRequestFactory class to make HTTP calls in a standardised manner. The class ensures that MediaWiki's internal logging features (e.g. the 'http' log channel) and configuration settings (e.g. http_proxy) are applied when HTTP calls are executed. RottenLinks, by contrast, calls the curl_ functions directly.
Example code (untested!):

$request = MediaWikiServices::getInstance()->getHttpRequestFactory()->create(
	$url,
	[
		'method' => 'HEAD', // return headers only
		'timeout' => $config->get( 'RottenLinksCurlTimeout' ),
		'userAgent' => 'RottenLinks, MediaWiki extension (https://github.com/miraheze/RottenLinks), running on ' . $config->get( 'Server' )
	],
	__METHOD__
)->execute();
return (int)$request->getStatus();

Just to mention: I am about to open a PR for this, but the final return here is not entirely correct. It should be return (int)$request->getStatusValue()->getValue(); instead, because ->execute() returns a Status instance, which has no getStatus() method. So we call getStatusValue() to get a StatusValue instance, and then getValue() to get the HTTP response code from it.

Sorry, I messed up my code after rewriting it. You should not chain ->getStatus() after ->execute(). See this example:

$request = MediaWikiServices::getInstance()->getHttpRequestFactory()->create(
	$url,
	[
		'method' => 'HEAD', // return headers only
		'timeout' => $config->get( 'RottenLinksCurlTimeout' ),
		'userAgent' => 'RottenLinks, MediaWiki extension (https://github.com/miraheze/RottenLinks), running on ' . $config->get( 'Server' )
	],
	__METHOD__
);
// execute() returns a Status object; the HTTP code is read from the request itself.
$reqexec = $request->execute();
return (int)$request->getStatus();
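
The integer returned by the check still has to be interpreted, which is where the "more effective checking" from the task description would come in. The following is only a minimal sketch of one way that could look. The function name classifyLinkStatus, the default list of bad codes, and the convention that 0 means "no response" are all assumptions made for illustration; none of them is taken from RottenLinks or MediaWiki.

```php
<?php
/**
 * Hypothetical helper (not part of RottenLinks or MediaWiki): turns the
 * integer produced by a status check into a coarse label.
 *
 * @param int $code HTTP status code; 0 is ASSUMED here to mean "no response".
 * @param int[] $badCodes Codes treated as a dead link (illustrative default).
 * @return string One of 'no-response', 'bad', 'ok' or 'unknown'.
 */
function classifyLinkStatus( int $code, array $badCodes = [ 404, 410 ] ): string {
	if ( $code === 0 ) {
		// Assumed convention: the request never produced a status line.
		return 'no-response';
	}
	if ( in_array( $code, $badCodes, true ) ) {
		return 'bad';
	}
	if ( $code >= 200 && $code < 400 ) {
		// Success and redirect ranges are both counted as reachable.
		return 'ok';
	}
	// Everything else is left undecided rather than reported as rotten.
	return 'unknown';
}
```

Keeping an 'unknown' bucket matters with HEAD requests in particular, since some servers answer HEAD with 405 even though a GET for the same URL would succeed; such links could be retried with GET rather than flagged.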

I merged https://github.com/miraheze/RottenLinks/pull/32. Not sure what else needs done for this task (if anything) so leaving open for now.

Universal_Omega claimed this task.

Closing as resolved, on the assumption that doing what Southparkfan suggested completes this task. If that is wrong, please reopen it. Thank you!