
Six questions about "Status of external links" and Extension:RottenLinks
Closed, Resolved, Public

Description

On nenawiki.org -

The “Status of external links” is very cool and could be a very valuable feature for us. We have lots of external links that often get changed or removed. Keeping them up to date is important to us.

But the behavior on nenawiki.org looks inconsistent. Here are some examples:

  1. Links missing from the list.

We added a page on 2/26 with external links: https://nenawiki.org/wiki/NENA_NG9-1-1_Go-To_Handbook_(Review_Draft)

This is one of them: https://content.govdelivery.com/accounts/USDOTNHTSA911/bulletins/198231b

Today, the “Status of external links” statistics page shows a “Script Data” run date of 5 March 2019.

But the link on a page we added on 2/26 is not on the list of external links. Should it be there?

  2. Fixed links still on the list.

I updated the rotten link http://dlsforum.org, but it is still on the list. Should it go away after being fixed?

Questions about the “Status of external links” feature, which seems to be Extension:RottenLinks. It is listed in Special:Version and described on MediaWiki.org, but it is not among our Miraheze managed extensions.

  3. When does the script run?
  4. Can the script schedule be modified, or can the script be run manually?
  5. How long does it take for the script to capture the external links on new or edited pages?
  6. Can the date be shown in our default time zone (ET)?

Thanks,

Mike Vislocky

Event Timeline

MikeV created this task. Mar 5 2019, 15:50
MikeV added a comment. Mar 5 2019, 17:48

Update: it looks like the link status was updated some time today (I can't be sure exactly when). Did you guys do something?

Now two links are shown as "No Response" but are actually OK:

https://niem.gtri.gatech.edu/niemtools/iepdt/display/container.iepd?ref=CPnFNm8J4lA
https://niem.gtri.gatech.edu/niemtools/iepdt/download/resource.iepd?ref=__bVqwEPq3o

And I'm still curious about Questions 3-6.

@John, since it's your extension

Paladox added a subscriber: Paladox. Mar 5 2019, 18:03

Answers to questions 3-6:

  3. It runs every two weeks (https://github.com/miraheze/puppet/blob/master/modules/mediawiki/manifests/jobrunner.pp#L74), but I guess you could file a feature request for an "Update Now" button.
  4. The script can be run manually.
  5. It depends on how many links you have on your wiki and how fast the site is. It can take upwards of a minute for wikis that have few external links, and longer for bigger wikis that have thousands or millions of external links. I currently do not have an ETA or estimate of how long it would take.
  6. No, but as with the "Update Now" button, you can request that as a feature.
John removed John as the assignee of this task. Mar 5 2019, 18:40
John added a subscriber: John.
MikeV added a comment. Mar 5 2019, 22:04

Thanks.

I guess I did not understand this on the "Show link statistics" page:

Script Data
Script execution time: 154 seconds
Script run date: 21:59, 5 March 2019

It looked like the script just ran. It always does.

John closed this task as Resolved. Mar 8 2019, 13:00
John assigned this task to Paladox.

6 is resolved.

1 should have been resolved when I ran the script the other day.

2 is intentional: it lists all links.

3-4 are answered above.

5 depends on when the script next runs.

MikeV reopened this task as Open. Mar 8 2019, 13:43

Two of the five links shown in the "rotten links" status page have always been good links.

John closed this task as Resolved. Mar 8 2019, 14:21

> $ch = curl_init( 'https://niem.gtri.gatech.edu/niemtools/iepdt/download/resource.iepd?ref=__bVqwEPq3o' );
> curl_exec( $ch );
> var_dump( curl_getinfo( $ch ) );
array(30) {
  ["url"]=>
  string(83) "https://niem.gtri.gatech.edu/niemtools/iepdt/download/resource.iepd?ref=__bVqwEPq3o"
  ["content_type"]=>
  NULL
  ["http_code"]=>
  int(0)
  ...
}

PHP cURL is not able to connect and get a response from the URL.

Looking further reveals:

> var_dump( curl_error( $ch ) );
string(63) "SSL certificate problem: unable to get local issuer certificate"

Therefore this is an issue with the website and not Miraheze or the extension.
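
For anyone who wants to repeat this check outside the interactive shell, here is a minimal sketch of the same steps as a script. It assumes the PHP cURL extension is loaded. The names `probeUrl` and `summarizeProbe` and the 10-second timeout are made up for illustration and are not part of RottenLinks.

```php
<?php
// Hypothetical helper (not RottenLinks code): request a URL with PHP cURL and
// return the same three values inspected in the session above.
function probeUrl(string $url, int $timeoutSeconds = 10): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);     // return the body instead of printing it
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds); // assumed timeout, adjust as needed
    curl_exec($ch);

    return [
        'http_code' => curl_getinfo($ch, CURLINFO_HTTP_CODE), // 0 when no HTTP response arrived
        'errno'     => curl_errno($ch),                       // 0 when cURL itself reported no error
        'error'     => curl_error($ch),                       // human-readable cURL error, or ''
    ];
}

// Turn a probe result into one line of text. Pure function, no network access.
function summarizeProbe(array $probe): string
{
    if ($probe['http_code'] !== 0) {
        return sprintf('HTTP %d', $probe['http_code']);
    }
    if ($probe['errno'] !== 0) {
        return sprintf('no HTTP response (cURL error %d: %s)', $probe['errno'], $probe['error']);
    }
    return 'no HTTP response';
}
```

Calling `summarizeProbe(probeUrl($url))` on a URL whose certificate chain cannot be verified should yield a "no HTTP response" line carrying the cURL error text, matching the `int(0)` and the SSL message shown above.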

MikeV added a comment. Mar 8 2019, 15:02

This is a little bewildering. The HTTP Response is clearly not "No Response." It looks like there is a response and it is more like an "SSL Error".

Should I submit a feature request somewhere to change the "HTTP Response" results to help us resolve the rotten links?

Where are your HTTP Response codes defined?

Thanks,

Mike

John added a comment. Mar 8 2019, 15:13

The HTTP code returned by cURL is 0, which we handle as “No Response” because, to cURL, it means that no response was received for the request.
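
As a rough illustration of that handling, the sketch below maps a status code of 0 to "No Response" and, when a cURL error number is also available, appends a coarse reason. This is an assumption about how such a mapping could look, not the extension's actual code. The function name `statusLabel` and the reason strings are invented; the integers are the standard libcurl error numbers, written as literals so the sketch runs without the cURL extension.

```php
<?php
// Hypothetical mapping (not RottenLinks code): turn an HTTP status code and an
// optional cURL error number into a display label.
function statusLabel(int $httpCode, int $curlErrno = 0): string
{
    if ($httpCode !== 0) {
        return (string) $httpCode; // a real HTTP status was received
    }

    // Coarse reasons for a few common libcurl error numbers.
    $reasons = [
        6  => 'DNS lookup failed',       // could not resolve host
        7  => 'connection failed',       // could not connect
        28 => 'timed out',               // operation timed out
        35 => 'TLS handshake failed',    // SSL connect error
        60 => 'certificate not trusted', // certificate verification failed
    ];

    return isset($reasons[$curlErrno])
        ? 'No Response (' . $reasons[$curlErrno] . ')'
        : 'No Response';
}
```

With only the status code, `statusLabel(0)` stays "No Response"; passing the error number as well, as in `statusLabel(0, 60)`, gives "No Response (certificate not trusted)".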

MikeV added a comment. Mar 8 2019, 15:20

https://niem.gtri.gatech.edu/niemtools/iepdt/display/container.iepd?ref=CPnFNm8J4lA

This URL can be successfully opened in Chrome, Edge, Firefox, and Internet Explorer without any indication of a problem.

What value, if any, does a 0 response from cURL (whatever that is) have?

John added a comment. Mar 8 2019, 15:51

HTTP response 0 in cURL means the request failed, as per the error above. The request itself does not return any indication of why it failed. Finding the error takes further debugging, which is more complex than can reasonably be integrated into the extension without increasing resource usage.