User:Monkbot/task 19: cite iucn update

Task 19 was originally conceived to update, from the IUCN Red List API, the 13,000 or so articles that use {{cite IUCN}} where |url= holds an old-form IUCN url. These articles are listed in Category:cite IUCN maint (1,637).

There are several old-form urls (not all of these work):

Old-form urls are considered 'old-form' because (when they work) they always point to the current assessment.

Most of these old-form urls are used in {{cite IUCN}} templates that are found in the |status_ref= parameter of {{speciesbox}} and {{taxobox}} templates (collectively hereafter 'taxobox') to support the values in the taxobox |status= and |status_system= parameters. Because values for |status= (IUCN uses the term 'category') and for |status_system= can be extracted or derived from the results of an additional IUCN API call, task 19 was expanded to support updating these taxobox parameters.

IUCN API

This task is generally slow. IUCN do not want anyone or anything hammering away at their API as fast as possible so task 19's calls to the IUCN API are spaced about 3 seconds apart. To accomplish this, the AWB Bots→Auto save→Delay setting is 3 seconds. This prevents task 19 from making edits that require only a single IUCN API call too quickly. For edits that require multiple IUCN API calls, task 19 imposes a 3-second pause before executing each IUCN API call after the first one.
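
The pacing rule above can be sketched as follows. This is a minimal illustration, not the code that task 19 runs: the `ApiThrottle` class and its members are hypothetical names, and the injectable `sleep` delegate exists only so that the behavior can be exercised without really waiting.

```csharp
using System;
using System.Threading;

// Hypothetical helper (not part of the published task 19 script).
// The first API call for an article is assumed to be paced by AWB's
// auto-save delay; every later call for the same article waits first.
public sealed class ApiThrottle
{
    private readonly TimeSpan pause;
    private readonly Action<TimeSpan> sleep;
    private bool first_call_made = false;

    public ApiThrottle(TimeSpan pause, Action<TimeSpan> sleep = null)
    {
        this.pause = pause;
        this.sleep = sleep ?? (t => Thread.Sleep(t));   // real sleep unless a stand-in is supplied
    }

    // Call immediately before each IUCN API request.
    public void BeforeCall()
    {
        if (first_call_made)
            sleep(pause);           // second and later calls: pause before calling
        first_call_made = true;     // first call: no extra pause
    }

    // Call when processing of a new article begins.
    public void Reset()
    {
        first_call_made = false;
    }
}
```

With `new ApiThrottle(TimeSpan.FromSeconds(3))`, an edit that needs a single API call is never delayed by the helper, while an edit that needs three calls is delayed twice.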

IUCN API calls require a token. While the code for this task is published, the task's token is not. Anyone considering reuse of this code must obtain their own token; do not use the publicly available demo token.

Task 19 fetches data from the IUCN API in four forms: two for species data and two for species citations, each keyed either by species name or by taxon id. The examples here use Anthus roseatus (the name) and 22718564 (the taxon id). The species data returns are:

name:
{"name":"Anthus roseatus","result":[{"taxonid":22718564,"scientific_name":"Anthus roseatus","kingdom":"ANIMALIA","phylum":"CHORDATA","class":"AVES","order":"PASSERIFORMES","family":"MOTACILLIDAE","genus":"Anthus","main_common_name":"Rosy Pipit","authority":"Blyth, 1847","published_year":2019,"assessment_date":"2019-06-13","category":"LC","criteria":null,"population_trend":"Stable","marine_system":false,"freshwater_system":true,"terrestrial_system":true,"assessor":"BirdLife International","reviewer":"Smith, D.","aoo_km2":null,"eoo_km2":"3530000","elevation_upper":5000,"elevation_lower":2700,"depth_upper":null,"depth_lower":null,"errata_flag":null,"errata_reason":null,"amended_flag":null,"amended_reason":null}]}
taxon id:
{"name":"22718564","result":[{"taxonid":22718564,"scientific_name":"Anthus roseatus","kingdom":"ANIMALIA","phylum":"CHORDATA","class":"AVES","order":"PASSERIFORMES","family":"MOTACILLIDAE","genus":"Anthus","main_common_name":"Rosy Pipit","authority":"Blyth, 1847","published_year":2019,"assessment_date":"2019-06-13","category":"LC","criteria":null,"population_trend":"Stable","marine_system":false,"freshwater_system":true,"terrestrial_system":true,"assessor":"BirdLife International","reviewer":"Smith, D.","aoo_km2":null,"eoo_km2":"3530000","elevation_upper":5000,"elevation_lower":2700,"depth_upper":null,"depth_lower":null,"errata_flag":null,"errata_reason":null,"amended_flag":null,"amended_reason":null}]}

The citation data returns are:

name:
{"name":"Anthus roseatus","result":[{"citation":"BirdLife International 2019. Anthus roseatus. The IUCN Red List of Threatened Species 2019: e.T22718564A152671411. https://dx.doi.org/10.2305/IUCN.UK.2019-3.RLTS.T22718564A152671411.en .Downloaded on 21 September 2021"}]}
taxon id:
{"name":"22718564","result":[{"citation":"BirdLife International 2019. Anthus roseatus. The IUCN Red List of Threatened Species 2019: e.T22718564A152671411. https://dx.doi.org/10.2305/IUCN.UK.2019-3.RLTS.T22718564A152671411.en .Downloaded on 21 September 2021"}]}

taxobox updates

Task 19 confirms, updates, or adds taxobox parameters |status=, |status_system=, and |status_ref= using data extracted from the IUCN API. The IUCN API data are fetched using a binomial species name; task 19 does not attempt to fetch IUCN API data using the taxon id found in any existing IUCN references in the taxobox. For taxobox updates, task 19 attempts to get the binomial from various taxobox parameters:

  • {{speciesbox}} parameters
    1. |taxon=
    2. |genus= + |species=
    3. |name=
  • {{taxobox}} parameters
    1. |binomial=
    2. |name=

When the taxobox has none of the above parameters, task 19 uses the article title in the IUCN API call.
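
The parameter precedence listed above might be sketched like this. It is only an illustration: `BinomialSource`, `NameGet`, and the dictionary-of-parameters input are hypothetical, empty parameters are assumed to count as absent, and the markup cleanup that the real script performs (`species_name_cleanup`) is left out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of the name-selection order; not the task 19 code.
public static class BinomialSource
{
    public static string NameGet(string template_name,
        IDictionary<string, string> parameters, string article_title)
    {
        bool is_speciesbox = template_name.Trim()
            .Equals("speciesbox", StringComparison.OrdinalIgnoreCase);
        string value;

        if (is_speciesbox)
        {
            if (TryGet(parameters, "taxon", out value)) return value;      // 1. |taxon=
            string genus, species;
            if (TryGet(parameters, "genus", out genus) &&
                TryGet(parameters, "species", out species))
                return genus + " " + species;                              // 2. |genus= + |species=
            if (TryGet(parameters, "name", out value)) return value;       // 3. |name=
        }
        else    // anything else is treated as {{taxobox}}
        {
            if (TryGet(parameters, "binomial", out value)) return value;   // 1. |binomial=
            if (TryGet(parameters, "name", out value)) return value;       // 2. |name=
        }
        return article_title;    // fallback: the article title
    }

    // Assumption: a parameter that is present but blank is treated as missing.
    private static bool TryGet(IDictionary<string, string> p, string key, out string value)
    {
        if (p.TryGetValue(key, out value) && !string.IsNullOrWhiteSpace(value))
        {
            value = value.Trim();
            return true;
        }
        value = null;
        return false;
    }
}
```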

Task 19 does not confirm, update, or add |status=, |status_system=, and |status_ref= when:

  • the name that task 19 extracted is not a binomial, usually because the taxobox or article title uses only the genus portion of the binomial
  • the IUCN API does not recognize the binomial as a valid name. When this happens task 19 adds Category:Taxobox binomials not recognized by IUCN and a hidden comment with the unrecognized binomial. Reasons that the IUCN API might not recognize the binomial are:
    • misspellings
    • typos
    • extraneous text
    • species name might not be 'globally assessed' but instead be 'regionally assessed' – the taxobox does not specify the region of an assessment so task 19 cannot use the regional form of the citation API call
    • the IUCN API does not support the redirect-like behavior for binomials that the search box at https://www.iucnredlist.org/ does

{{speciesbox}} parameters |status2=, |status2_system=, and |status2_ref= are not handled in the same way as their non-enumerated counterparts. This is because there are relatively few instances of the enumerated forms (~25 according to this search 2021-09-20). |status2_ref= may be updated by subsequent task 19 processes but |status2= and |status2_system= will not be.

{{automatic taxobox}} and {{subspeciesbox}} support |status=, |status_system=, and |status_ref= but task 19 does not attempt to update these parameters as a group because the use of these parameters in those templates is comparatively rare and because species names upon which task 19 depends are inconsistent in comparison to {{speciesbox}} and {{taxobox}}. Task 19 may choose to update the content of |status_ref= in these templates if the parameter uses an old-form url or is a plain-text citation but will not attempt to update |status= and |status_system= nor will it remove duplicate |status_ref= references.

IUCN status

From the IUCN API call for species data using the binomial, task 19 extracts the category value and the assessment_date value. The species IUCN status is confirmed when |status= has the same value as the category returned from the IUCN API. When they are different, task 19 updates |status= to the value from the IUCN API. When |status= is missing (because it was never there or because an empty parameter was deleted) task 19 updates |status= or adds a new |status= at the end of the taxobox. Updates, confirmation, and additions are noted in the edit summary.
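
As an illustration of the extraction step, the sketch below pulls string-valued fields out of a species return with a regular expression. The published script also uses regular expressions for this (for example `status_from_api_pattern`), but its actual patterns are not reproduced here; `SpeciesReturn` and `FieldGet` are hypothetical names, and the pattern assumes that values contain no escaped quotation marks.

```csharp
using System.Text.RegularExpressions;

// Hypothetical sketch; not the patterns used by task 19.
public static class SpeciesReturn
{
    // Returns the string value assigned to <key> in the API return, or null
    // when the key is absent or its value is not a JSON string (e.g. null).
    public static string FieldGet(string json, string key)
    {
        Match m = Regex.Match(json,
            "\"" + Regex.Escape(key) + "\"\\s*:\\s*\"([^\"]*)\"");
        return m.Success ? m.Groups[1].Value : null;
    }
}
```

Applied to the Anthus roseatus return shown earlier, `FieldGet(json, "category")` gives `LC` and `FieldGet(json, "assessment_date")` gives `2019-06-13`; `FieldGet(json, "criteria")` gives null because that value is a JSON null.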

IUCN status displayed on an IUCNredlist web page may be different from the category returned from the IUCN API – task 19 uses the IUCN API's category; cf. (as of 2021-09-22):

  • NT (from the Zenia insignis web page)
  • LR/nt (from the IUCN API):
    {"name":"32462","result":[{"taxonid":32462,"scientific_name":"Zenia insignis","kingdom":"PLANTAE","phylum":"TRACHEOPHYTA","class":"MAGNOLIOPSIDA","order":"FABALES","family":"FABACEAE","genus":"Zenia","main_common_name":null,"authority":"Chun","published_year":1998,"assessment_date":"1998-01-01","category":"LR/nt","criteria":null,"population_trend":null,"marine_system":false,"freshwater_system":false,"terrestrial_system":true,"assessor":"World Conservation Monitoring Centre","reviewer":"","aoo_km2":null,"eoo_km2":null,"elevation_upper":null,"elevation_lower":null,"depth_upper":null,"depth_lower":null,"errata_flag":null,"errata_reason":null,"amended_flag":null,"amended_reason":null}]}

IUCN status system

To update or add a taxobox |status_system= parameter, task 19 extracts the year portion from the IUCN API's assessment_date value. If the assessment year is 2000 or earlier, task 19 sets |status_system=IUCN2.3; otherwise it sets |status_system=IUCN3.1. The threshold date is taken from Wikipedia:Conservation status. When |status_system= is missing, task 19 adds a new parameter at the end of the taxobox. Updates and additions are noted in the edit summary; confirmations are not.
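
The year threshold can be written as a small function. This is a sketch under the assumption that assessment_date always has the yyyy-MM-dd form seen in the API returns above; `StatusSystem` and `FromAssessmentDate` are hypothetical names.

```csharp
using System;
using System.Globalization;

// Hypothetical sketch of the |status_system= rule; not the task 19 code.
public static class StatusSystem
{
    public static string FromAssessmentDate(string assessment_date)
    {
        // Assumes the yyyy-MM-dd form used in the API examples; throws FormatException otherwise.
        DateTime date = DateTime.ParseExact(assessment_date, "yyyy-MM-dd",
            CultureInfo.InvariantCulture);
        return (date.Year <= 2000) ? "IUCN2.3" : "IUCN3.1";    // 2000 or earlier: version 2.3
    }
}
```

For the two species used as examples on this page, the 1998-01-01 assessment of Zenia insignis maps to IUCN2.3 and the 2019-06-13 assessment of Anthus roseatus maps to IUCN3.1.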

IUCN status reference

To update or add |status_ref=, task 19 inspects the parameter value for a date that task 19 would have written (<ref name="iucn status date">...</ref>) or the existing citation's |access-date= (in that order). When a date can be extracted from one of these, it is compared to the current date. Task 19 will attempt to update |status_ref= only when the difference between the current date and the reference date is greater than six months or when no date can be extracted. This six-month limit was arbitrarily chosen on the presumption that IUCN updates their database twice a year.
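
The age test might look like the sketch below. How task 19 actually measures 'six months' is not stated here, so the use of calendar months via `AddMonths` is an assumption, and `StatusRefAge` and `NeedsUpdate` are hypothetical names.

```csharp
using System;

// Hypothetical sketch of the six-month rule; not the task 19 code.
public static class StatusRefAge
{
    // reference_date is null when no date could be extracted from |status_ref=.
    public static bool NeedsUpdate(DateTime? reference_date, DateTime today)
    {
        if (!reference_date.HasValue)
            return true;    // no date: always attempt an update
        // Assumption: 'greater than six months' means more than six calendar months.
        return reference_date.Value.Date.AddMonths(6) < today.Date;
    }
}
```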

Task 19 will not update templated citations in |status_ref= if the citation has one of:

  • |amends=<year>
  • |errata=<year>

Similarly, task 19 will not update plain-text citations in |status_ref= if the citation has one of:

  • (amended version of <year> assessment)
  • (errata version published in <year>)

This is because the IUCN API does not provide the <year> of amendment or errata.

When the six-month limit is met, and when the citation in |status_ref= does not hold the amended or errata parameters or strings, task 19 then inspects the associated reference tag:

  1. <ref> – unnamed reference;
    • replaces the value assigned to |status_ref= with <ref name="iucn status date"><new {{cite IUCN}} from IUCN API></ref>
    where date in name="iucn status date" is a copy of the value assigned to the new {{cite IUCN}} template's |access-date= parameter
  2. <ref name=name> – named reference:
    • replaces that reference with <ref name="iucn status date"><new {{cite IUCN}} from IUCN API></ref>
    • replaces all instances of <ref name=name /> with <ref name="iucn status date" />
    where date in name="iucn status date" is a copy of the value assigned to the new {{cite IUCN}} template's |access-date= parameter
  3. <ref name=name /> – named self-closed reference:
    • swaps the self-closed reference tag with the reference definition
    • replaces the citation as described in 2
    • if the definition was (and now the self-closed ref tag is) inside {{reflist|refs=}} then the self-closed ref tag is deleted
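
A sketch of how the dated reference name and the renaming in case 2 could be produced is shown below. The day-month-year date form is an assumption inferred from the source search listed in the script header (`iucn status [0-9]+[^0-9]+2021`); `StatusRefTag` and its methods are hypothetical names, not functions from the script.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical sketch; not the task 19 code.
public static class StatusRefTag
{
    // Builds the reference name from the new citation's |access-date= value.
    // Assumption: the date is written as day, month name, year.
    public static string NameGet(DateTime access_date)
    {
        return "iucn status " +
            access_date.ToString("d MMMM yyyy", CultureInfo.InvariantCulture);
    }

    // Replaces every self-closed <ref name=old_name /> (quoted or unquoted)
    // with a self-closed tag that uses new_name.
    public static string SelfClosedRename(string text, string old_name, string new_name)
    {
        string pattern = @"<ref\s+name\s*=\s*""?" + Regex.Escape(old_name) + @"""?\s*/>";
        string replacement = "<ref name=\"" + new_name + "\" />";
        return Regex.Replace(text, pattern, m => replacement);
    }
}
```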

{{cite IUCN}} template updates

For {{cite IUCN}} templates that have old-form urls, task 19 extracts the taxon id from the url and attempts to fetch citation data from the IUCN API using the taxon id. If the IUCN API does not recognize the taxon id, task 19 will attempt to get a citation from the API by using the value assigned to |title= in the {{cite IUCN}} template. When successful, task 19 replaces the old {{cite IUCN}} template with a new {{cite IUCN}} template that has parameter values from the IUCN API citation.

When the taxon/assessment ids in a new {{cite IUCN}} template's |page= and |doi= parameters are not the same, the citation is not updated because {{cite IUCN}} will emit a |doi= / |url= mismatch error message. The mismatch is usually an indication that the assessment has errata. The citation rendered on an IUCN species web page indicates the errata year but, at the time of this writing, that value is not available in the citation returned from the IUCN API. IUCN have been notified of this discrepancy.
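
The |page= / |doi= comparison can be illustrated as follows. The sketch assumes that both values carry the T<taxon id>A<assessment id> form seen in the citation returns above; `AssessmentIds` and its methods are hypothetical names.

```csharp
using System.Text.RegularExpressions;

// Hypothetical sketch of the mismatch check; not the task 19 code.
public static class AssessmentIds
{
    private static readonly Regex id_pattern = new Regex(@"T(\d+)A(\d+)");

    // Returns the T###A### portion of a |page= or |doi= value, or null if none.
    public static string IdsGet(string value)
    {
        Match m = id_pattern.Match(value ?? "");
        return m.Success ? m.Value : null;
    }

    // True only when both values hold ids and the ids are identical.
    public static bool PageDoiMatch(string page, string doi)
    {
        string page_ids = IdsGet(page);
        string doi_ids = IdsGet(doi);
        return (null != page_ids) && (page_ids == doi_ids);
    }
}
```

For the Anthus roseatus citation, the page e.T22718564A152671411 and the doi 10.2305/IUCN.UK.2019-3.RLTS.T22718564A152671411.en both yield T22718564A152671411, so the citation would be updated.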

plain-text citation updates

For the purposes of this task, plain-text references are untemplated IUCN references inside named or unnamed <ref>...</ref> tags or IUCN references as a line item in an unordered list (* markup). Task 19 will update plain-text references when it can extract a taxon id from an IUCN page identifier (e.T###A###), from an IUCN doi (as a doi inside {{doi}} or as a url), or from an IUCN url.
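
A sketch of the taxon id lookup follows, trying the page identifier first, then the doi, then the url (the order noted in the script's comment for `plain_text_taxon_id_get`). The patterns are assumptions made for illustration: in particular, the url pattern covers only the /species/ and /details/ path forms and is not the set of url forms that task 19 recognizes. `PlainTextTaxonId` and `IdGet` are hypothetical names.

```csharp
using System.Text.RegularExpressions;

// Hypothetical sketch; not the patterns used by task 19.
public static class PlainTextTaxonId
{
    private static readonly string[] patterns =
    {
        @"e\.T(\d+)A\d+",                                   // IUCN page identifier
        @"10\.2305/IUCN\.UK\.\S*?RLTS\.T(\d+)A\d+",         // IUCN doi (in {{doi}} or a url)
        @"iucnredlist\.org/(?:species|details)/(\d+)"       // IUCN url (assumed path forms)
    };

    // Returns the taxon id as a string of digits, or null when none is found.
    public static string IdGet(string plain_text)
    {
        foreach (string pattern in patterns)
        {
            Match m = Regex.Match(plain_text, pattern);
            if (m.Success)
                return m.Groups[1].Value;
        }
        return null;
    }
}
```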

duplicate citations

Task 19 will replace named and unnamed references that hold {{cite IUCN}} templates that match {{cite IUCN}} in |status_ref= with <ref name="iucn status date" /> tags. <ref name=name /> associated with named references that hold {{cite IUCN}} templates that match {{cite IUCN}} in |status_ref= are replaced with <ref name="iucn status date" /> tags.

Duplicate references that wholly make up an entry in an unordered list are deleted as redundant.

Task 19 does not remove any other references.

ancillary tasks

Task 19 may update a {{IUCN status}} template's status value in its first positional parameter ({{{1|}}}) from the IUCN API when {{IUCN status}} has a valid taxon id as its second positional parameter ({{{2|}}}).

As with all other Monkbot tasks, task 19 does not run with AWB general fixes turned on.

abandoned edits

Task 19 will abandon edits when:

  • the article uses {{r}}
  • the article uses {{#tag:ref}} parser functions
  • the number of {{cite IUCN}} templates evaluated is equal to the number of IUCN API calls that returned nil values
  • the article contains {{bots|deny=monkbot/task 19}}

edit summaries

Task 19 emits terse edit summaries. An edit summary is a concatenation of one or more of these message fragments:

  • IUCN status confirmed (n×) – number of taxobox |status= and {{IUCN status}} values that were confirmed to match the IUCN API returned value; when there is only one confirmation (the most common case), the parenthetical count is omitted
  • IUCN status updated (n×) – number of taxobox |status= and {{IUCN status}} values that were updated to match the IUCN API returned value; when there is only one update, the parenthetical count is omitted
  • IUCN status added – a taxobox |status= parameter was added using the IUCN API returned value
  • IUCN status system updated – a taxobox |status_system= parameter was updated to match the IUCN API returned value
  • IUCN status system added – a taxobox |status_system= parameter was added using the IUCN API returned value
  • IUCN status ref updated – a taxobox |status_ref= parameter was updated to match the IUCN API returned value
  • IUCN status ref added – a taxobox |status_ref= parameter was added using the IUCN API returned value
    • [duplicate removed] or [duplicates removed (n×)] – suffix added to 'IUCN status ref updated' or 'IUCN status ref added' messages when duplicate reference(s) have been removed
  • IUCN status ref current – the citation in |status_ref= is not older than six months
  • evaluated n template(s) – the number of {{cite IUCN}} templates that task 19 inspected for use of old-form urls
  • n template(s) modified – the number of {{cite IUCN}} templates with old-form urls that task 19 updated
  • evaluated n reference(s) – the number of plain-text references that task 19 inspected
  • n reference(s) modified – the number of plain-text references that task 19 updated
  • API species nil return (id) (n×) – emitted when IUCN API did not return species data for a given taxon id
  • API species nil return (name) (n×) – emitted when IUCN API did not return species data for a given species name
  • API cite nil return (n×) – emitted when IUCN API did not return citation data (species name or taxon id)
  • unrecognized binomial: binomial – the binomial that task 19 used to fetch data from the IUCN API for the taxobox parameter
  • (n/mm:ss.ms) – n is the number of IUCN API calls; mm:ss.ms – minutes, seconds and milliseconds required to process the article
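
The counted fragments and the timing fragment could be assembled along these lines. The count suffix mirrors the conditional expression visible in the script's FINISH section; the zero-padded digit widths in the timing fragment are an assumption, and `SummaryFragments` and its methods are hypothetical names.

```csharp
using System;

// Hypothetical sketch of edit-summary fragments; not the task 19 code.
public static class SummaryFragments
{
    // " <label>;" for a single item, " <label> (n×);" for more than one,
    // and an empty string when there is nothing to report.
    public static string Counted(string label, int count)
    {
        if (count <= 0)
            return "";
        return " " + label + ((1 < count) ? " (" + count + "×);" : ";");
    }

    // "(n/mm:ss.ms)": number of API calls and processing time.
    // Assumption: two-digit minutes and seconds, three-digit milliseconds.
    public static string Timing(int api_call_count, TimeSpan elapsed)
    {
        return string.Format("({0}/{1:00}:{2:00}.{3:000})",
            api_call_count, (int)elapsed.TotalMinutes, elapsed.Seconds, elapsed.Milliseconds);
    }
}
```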

script

/*use the iucn api to fetch IUCN categories to update {{taxobox}} and {{speciesbox}} |status= and status_system=parametersuse the iucn api to fetch assessment citations to update {{taxobox}} and {{speciesbox}} |status_ref= parameterswith current {{cite IUCN}} templatesuse the iucn api to fetch assessment citations to update {{cite IUCN}} templates with old-form urlsuse the iucn api to fetch IUCN categories to update second positional parameter in {{IUCN status}} templatessource categories:Category:Taxonomy articles created by PolbotCategory:cite IUCN maintsource searches:insource:/Downloaded on [0-3][0-9] +[JFMASOND][a-z]+ +[0-9]{4}/hastemplate:"cite IUCN" -incategory:"Taxobox binomials not recognized by IUCN" -insource:/iucn status [0-9]+[^0-9]+2021/*///---------------------------< P R O C E S S A R T I C L E >--------------------------------------------------//////List<string> error_log_list = new List<string>();public string ProcessArticle(string ArticleText, string ArticleTitle, int wikiNamespace, out string Summary, out bool Skip){Skip = false;// assume that something will be changed// these use redirect to User:Monkbot/task 19: cite IUCN update//Summary = "[[User:Monkbot/task 19|Task 19]] (manual dev test): convert/update IUCN references to {{[[Template:cite IUCN|cite IUCN]]}} using data from [[IUCN Red List]] [[API]];";//Summary = "[[User:Monkbot/task 19|Task 19]] (BRFA trial): convert/update IUCN references to {{[[Template:cite IUCN|cite IUCN]]}} using data from [[IUCN Red List]] [[API]];";Summary = "[[User:Monkbot/task 19|Task 19]]: convert/update IUCN references to {{[[Template:cite IUCN|cite IUCN]]}} using data from [[IUCN Red List]] [[API]];";inttemplate_modified_count = 0;// number of cite IUCN templates that were modified from the iucn apiintother_template_modified_count = 0;// number of cite journal/web templates that were converted to {{cite IUCN}}// reset these static countersplain_text_modified_count = 0;// number of plain-text citations that were 
modified from the iucn apiplain_text_count = 0;// total number of plain-text iucn referencesapi_call_count = 0;// number of api calls madeapi_fetch_fail_count = 0;// number of api fetches that failedapi_no_cite_return_count = 0;// number of times that the api returned a non-citation valueapi_no_species_return_name_count = 0;// number of times that the api returned a non-species value (species binomial)api_no_species_return_id_count = 0;// number of times that the api returned a non-species value (species id for {{IUCN status}})iucn_status_confirmed_count = 0;// number of times that we confirmed the iucn status in taxobox-like templatesiucn_status_updated_count = 0;// number of times that we updated the iucn status in taxobox-like templatesiucn_status_system_updated_count = 0;// number of times that we updated the iucn status system in taxobox-like templatesiucn_template_count = 0;// total number of cite IUCN templatesother_template_count = 0;// total number of cite journal/web templatesparse_fail_count = 0;// number of times that we couldn't parse the api returnpage_doi_skip_count = 0;// number of templates or plain-text references skipped because page and doi assessment ID mismatchstatus_added = false;// set to true when |status= createdstatus_system_added = false;// set to true when |status_system createdstatus_ref_added = false;// set to true when |status_ref= createdstatus_ref_updated = false;// set to true when |status_ref= updatedstatus_ref_current = false;// set to true when |status_ref= less than 6 months oldduplicates_removed_count = 0;// number of duplicate status references removedtaxobox_blank = null;// gets blank taxobox as flagunrecognized_species_name = null;// gets taxobox species name that IUCN doesn't recognizeSystem.Diagnostics.Stopwatch stopwatch = new System.Diagnostics.Stopwatch();// set up a stopwatchstopwatch.Start();// and start itif (Regex.Match (ArticleText, @"\{\{\s*#tag:ref").Success){Summary = "Article uses {{#tag:ref}} parser 
function(s)";error_log_add ("Article uses " + code_nowiki("{{#tag:ref}}") + " parser function(s)");// add error message to listlog_errors (ArticleTitle, error_log_list);// dump list to the log fileSkip = true;return ArticleText;}if (Regex.Match (ArticleText, @"\{\{\s*[Rr]\s*\|").Success){Summary = "Article has {{r}} template(s)";error_log_add ("Article has " + code_nowiki("{{r}}") + " template(s)");// add error message to listlog_errors (ArticleTitle, error_log_list);// dump list to the log fileSkip = true;return ArticleText;}if (null == api_token){System.IO.StreamReader sr = new System.IO.StreamReader (iucn_api_token_file);// open the api token file for readingapi_token = "?token=" + sr.ReadLine();// read the token (must be the only thing in the file)sr.Close();// close and done}if (null == api_token)// but just in case{Summary = "Failed to read: " + iucn_api_token_file;// announce failureerror_log_add ("Failed to read: " + iucn_api_token_file);// add error message to listlog_errors (ArticleTitle, error_log_list);// dump list to the log fileSkip = true;return ArticleText;}ArticleText = Regex.Replace (ArticleText, @"[\r\n]+\[\[Category:Taxobox binomials not recognized by IUCN\]\][^\r\n]*", "");// remove if present; will be restored if necessary//---------------------------< T A X O B O X >----------------------------------------------------------------//// <taxobox> holds the content of {{taxobox}} or {{Speciesbox}} and then is modified by taxobox_update().  The// source template in <ArticleText> is replaced with an empty skeleton ('{{taxobox}}' or '{{Speciesbox}}' but// without contents.  At the end, this skeleton is replaced with the modified taxobox held in <taxobox>.//// The reason for this round-about is to prevent other portions of this script from evaluating and tallying// the reference in |status_ref=.  
Also permits easy replacement of references that duplicate the reference in// |status_ref=.//ArticleText = Regex.Replace (ArticleText, hide_non_ref_tag_pattern, hide_non_ref_replace_val);ArticleText = hide (ArticleText, HIDE_ALL_BUT_TAXOBOX);// hide all templates except taxobox-like templatesArticleText = hide (ArticleText, HIDE_ALL_BUT_TAXOBOX);// hide all templates except taxobox-like templates//if (1 == 1) return ArticleText;string taxobox = taxobox_get (ArticleText);taxobox_status_ref = null;// reset the 'new' value for |status_ref; used at the end to remove duplicatestaxobox_status_ref_open_tag = null;// its matching ref open tagtaxobox_status_ref_sc_tag = null;// and its matching self-closed tagtaxobox_update (ref taxobox, ref ArticleText, ArticleTitle);// update the taxobox |status=, |status_system=, and |status_ref=ArticleText = unhide (ArticleText);//---------------------------< C I T E   I U C N   U P D A T E S >--------------------------------------------//// this segment updates {{cite IUCN}} templates that have old-form urls.  There are a variety of old-form urls// but the most common indicator is the taxon id followed by a zero (0) for the assessment id.  This section// fetches the current citation from the IUCN API using the taxon id (preferred) or the using the 'name' in |title=.// The 'name' in |title= is presumed to be an italicized binomial//// {{cite IUCN}} templates with |ref= holding any value retain the parameter so that {{sfn}} or {{harv}} links// aren't broken.  
Any replacement citation that does not use |ref= may have a different author list from the// 'original' so, when the underlying {{cite journal}} creates a CITEREF id for the new name list, the {{sfn}}// or {{harv}} links will be broken ...//// does not update references in the taxobox (|status_ref= handled above); example: [[Picea abies]]//ArticleText = hide (ArticleText, IS_CITE_IUCN);// hide all templates except cite IUCN templatesif (Regex.Match (ArticleText, iucn_template_pattern).Success)ArticleText = Regex.Replace (ArticleText, iucn_template_pattern,delegate(Match match){stringtemplate = match.Groups[0].Value;// this will be returned if no changesstringref_param = null;iucn_template_count++;// bump total number of cite IUCN templates tallystring id = taxon_id_from_old_form_url_get (template);if (null == id)// not an old-form-url template so ignore itreturn template;if (Regex.Match (template, @"__P1P3__\s*(?:errata|amends)\s*=\s*\d{4}").Success){error_log_add ("[cite IUCN update]: template has |errata= or |amends= parameter (id: " + id + ")");return template;}string name = null;if (Regex.Match (template, iucn_title).Success){name = Regex.Match (template, iucn_title).Groups[1].Value.Trim();name = species_name_cleanup (name);// remove markup, extinction markers, disambiguation, etc}string api_url_id = api_id_url + id + api_token;// build the url from its various partsstring api_url_name = api_name_url + name + api_token;// build the url from its various partsstring cite_iucn = cite_iucn_get (api_url_id, api_url_name, ArticleTitle, id, name);if (null == cite_iucn)return template;template = Regex.Replace (template, ref_param_empty, "$1");// remove empty |ref= parameters from templateif (Regex.Match (template, ref_param_not_empty).Success)// if this template has |ref=<something>ref_param = Regex.Match (template, ref_param_not_empty).Groups[1].Value.Trim();// get its assigned valueif (null != ref_param)cite_iucn = Regex.Replace (cite_iucn, @"(\}\})", " |ref=" + 
ref_param + "$1");// add the preexisting |ref= paramtemplate_modified_count++;return cite_iucn;});ArticleText = unhide (ArticleText);// unhide all that is hidden//---------------------------< C I T E   J O U R N A L / W E B   U P D A T E S >------------------------------//// this segment updates {{cite journal}} abd {{cite web}} templates that have iucn urls, or pages or dois.  This// section fetches the current citation from the IUCN API using the taxon id (preferred) or the using the 'name'// in |title=.  The 'name' in |title= is presumed to be an italicized binomial//// {{cite journal}} and {{cite web}} templates with |ref= holding any value retain the parameter so that {{sfn}}// or {{harv}} links aren't broken.  Any replacement {{cite IUCN}} that does not use |ref= may have a different// author list from the 'original' so, when the underlying {{cite journal}} creates a CITEREF id for the new name// list, the {{sfn}} or {{harv}} links will be broken ...//// does not update references in the taxobox (|status_ref= handled above)//ArticleText = hide (ArticleText, IS_CITE_OTHER);// hide all templates except cite journal and cite web templatesif (Regex.Match (ArticleText, other_template_pattern).Success)ArticleText = Regex.Replace (ArticleText, other_template_pattern,delegate(Match match){stringtemplate = match.Groups[0].Value;// this will be returned if no changesstringref_param = null;other_template_count++;// bump total number of cite journal/web templates tallystringid = plain_text_taxon_id_get (template);// attempt to get taxon id from page -> doi -> urlif (null == id)// not an 'iucn' template so ignore itreturn template;//cite journal and cite web don't support |errata= or |amends=//if (Regex.Match (template, @"__P1P3__\s*(?:errata|amends)\s*=\s*\d{4}").Success)//{//error_log_add ("[cite IUCN update]: template has |errata= or |amends= parameter (id: " + id + ")");//return template;//}string name = null;if (Regex.Match (template, iucn_title).Success)// get value 
assigned to |title={name = Regex.Match (template, iucn_title).Groups[1].Value.Trim();name = species_name_cleanup (name);// remove markup, extinction markers, disambiguation, etc}string api_url_id = api_id_url + id + api_token;// build the api url from its various partsstring api_url_name = api_name_url + name + api_token;// build the api url from its various partsstring cite_iucn = cite_iucn_get (api_url_id, api_url_name, ArticleTitle, id, name);if (null == cite_iucn)return template;template = Regex.Replace (template, ref_param_empty, "$1");// remove empty |ref= parameters from templateif (Regex.Match (template, ref_param_not_empty).Success)// if this template has |ref=<something>ref_param = Regex.Match (template, ref_param_not_empty).Groups[1].Value.Trim();// get its assigned valueif (null != ref_param)cite_iucn = Regex.Replace (cite_iucn, @"(\}\})", " |ref=" + ref_param + "$1");// add the preexisting |ref= paramother_template_modified_count++;return cite_iucn;});ArticleText = unhide (ArticleText);// unhide all that is hidden//---------------------------< P L A I N _ T E X T _ R E F _ U P D A T E >------------------------------------//// update plain-text references first in ArticleText and then in the taxobox//ArticleText = plain_text_ref_update (ArticleText, ArticleTitle);// all of these create or rely on <ref iucn status <'date'>>{{cite IUCN}}if ((status_added || (0 != iucn_status_confirmed_count) || (0 != iucn_status_updated_count)) && (status_ref_added || status_ref_updated || status_ref_current))taxobox = plain_text_ref_update (taxobox, ArticleTitle);// do not update plain-text references in taxobox because |status_ref= might be plain text//---------------------------< I U C N   P L A I N - T E X T   B I B L I O G R A P H Y   U P D A T E >--------//// this is the plain-text form API id only.  
Plain-text references in bibliographies must be in unordered list// markup \n*...\n//// known issues://because this attempts to locate 'correct' plain-text citations and because any non-template and non-//wikilink text is plain text, plain text that is part of the unordered list item that is not part of the//actual IUCN citation will be treated as part of the citation and will be replaced with the {{cite IUCN}}//template if the API returns a citation for the taxon id.//if (Regex.Match (ArticleText, plain_text_bib_pattern).Success)// must have the form \n*plain text\n must be constrained because article is plain textArticleText = Regex.Replace (ArticleText, plain_text_bib_pattern,delegate(Match match){stringplain_text = match.Groups[0].Value;// this will be returned if no changesstringtaxon_id = plain_text_taxon_id_get (plain_text);// attempt to get taxon idif (null == taxon_id)return plain_text;// no taxon id so abandonif (is_plain_text_rejected (plain_text))// returns true when plain_text is rejectedreturn plain_text;stringref_open = match.Groups[1].Value;// the opening \n*stringref_close = match.Groups[3].Value;// the closing \n tagplain_text_count++;// bump total number of plain-text references foundstring api_url = api_id_url + taxon_id + api_token;// build the url from its various partsstring cite_iucn = cite_iucn_get (api_url, null, ArticleTitle, taxon_id, null);// go build a {{cite IUCN}} template from the apiif (null == cite_iucn)return plain_text;// template build failedplain_text_modified_count++;return ref_open + cite_iucn + ref_close;});//---------------------------< I U C N   S T A T U S   T E M P L A T E >--------------------------------------//// Update status in {{IUCN status|<status>|<taxon id>|<options>}}//if (Regex.Match (ArticleText, iucn_status_template_pattern).Success)ArticleText = Regex.Replace (ArticleText, iucn_status_template_pattern,delegate(Match match){string template = match.Groups[0].Value;// if no change, return thisstring status = 
null;string id = null;if (Regex.Match (template, iucn_status_status).Success)status = Regex.Match (template, iucn_status_status).Groups[2].Value;elsereturn template;if (Regex.Match (template, iucn_status_id).Success)id = Regex.Match (template, iucn_status_id).Groups[2].Value;elsereturn template;string species_from_api;// species data from the API will go herestring api_url = api_species_id_url + id + api_token;// build the url from its various partsspecies_from_api = api_fetch (api_url, ArticleTitle);// fetch species data from the IUCN APIif (null == species_from_api)// if api_fetch() failedreturn template;stringstatus_from_api = null;if (Regex.Match (species_from_api, status_from_api_pattern).Success)status_from_api = Regex.Match (species_from_api, status_from_api_pattern).Groups[1].Value;else{error_log_add ("[iucn status template]: API did not return species data: " + code_nowiki (species_from_api));api_no_species_return_id_count++;return template;}if (status == status_from_api)// if status same as api statusiucn_status_confirmed_count++;// bump the confirmed count and doneelse{template = Regex.Replace (template, iucn_status_lead + status, "$1" + status_from_api);// updateiucn_status_updated_count++;// bump the updated count}return template;});//--------------------------- R E M O V E   D U P L I C A T E   S T A T U S   R E F >-------------------------//// convert |status_ref= {{cite IUCN}} template into a regex to find duplicates of itself in ArticleText and// then replace any duplicates with the |status_ref= self-closed tag from |status_ref=//// replaces duplicates in taxobox only after hiding the |status_ref= definition so that we don't lose the definition//// problem: if the duplicate is named and is the definition for other self-closed ref tags, all of those tags// need to be renamed ... 
// argh example: [[Bellamya trochlearis]], [[Catarina pupfish]]
//
    if ((null != taxobox_status_ref) && (null != taxobox_status_ref_sc_tag))
    {
        string taxobox_status_ref_pattern = taxobox_status_ref;
        foreach (string symbol in symbols)
            taxobox_status_ref_pattern = Regex.Replace (taxobox_status_ref_pattern, symbol, symbol);  // convert taxobox_status_ref to a regex pattern

        // references in unordered lists always ok to replace
        ArticleText = counted_replace (ArticleText, bib_open_ul + taxobox_status_ref_pattern + bib_close_ul, "$1", ref duplicates_removed_count);
        // references with unnamed <ref> tags always ok to replace
        ArticleText = counted_replace (ArticleText, ref_open_tag_unnamed + @"\s*" + taxobox_status_ref_pattern + @"\s*" + ref_close_tag, taxobox_status_ref_sc_tag, ref duplicates_removed_count);
        taxobox = counted_replace (taxobox, ref_open_tag_unnamed + @"\s*" + taxobox_status_ref_pattern + @"\s*" + ref_close_tag, taxobox_status_ref_sc_tag, ref duplicates_removed_count);

        taxobox = hide_taxobox_status_ref (taxobox, taxobox_status_ref_open_tag, taxobox_status_ref_pattern);  // hide |status_ref= {{cite IUCN}} template so we don't replace it with sc tag
        named_status_ref_dup_remove (ref ArticleText, ref taxobox, taxobox_status_ref_pattern, taxobox_status_ref_sc_tag);  // remove duplicates

        // remove sequential instances of taxobox_status_ref_open_tag_sc TODO: this could be improved
        string taxobox_status_ref_open_tag_sc = Regex.Replace (taxobox_status_ref_open_tag, @"([^\>]+)\>", "$1 />");
        taxobox = Regex.Replace (taxobox, taxobox_status_ref_open_tag_sc + @"\s*" + taxobox_status_ref_open_tag_sc, taxobox_status_ref_sc_tag);
        ArticleText = Regex.Replace (ArticleText, taxobox_status_ref_open_tag_sc + @"\s*" + taxobox_status_ref_open_tag_sc, taxobox_status_ref_sc_tag);
    }

//---------------------------< C L E A N U P >----------------------------------------------------------------

    if (null != taxobox)
        taxobox = unhide (taxobox);

    ArticleText = hide (ArticleText, "[Rr]eflist");
    while (Regex.Match (ArticleText, reflist_cleanup).Success)  // remove self-closed ref tags from {{reflist}} (European fire-bellied toad)
    {
        ArticleText = Regex.Replace (ArticleText, reflist_cleanup, "$1");
        ArticleText = Regex.Replace (ArticleText, @"(\{\{)\s*([Rr]eflist[^\|]*)\s*\|\s*refs\s*=\s*(\}\})", "$1$2$3");
    }
    ArticleText = unhide (ArticleText);

    if (null != taxobox)
        ArticleText = Regex.Replace (ArticleText, taxobox_blank_pattern, taxobox);

    ArticleText = Regex.Replace (ArticleText, angle_open, "<");
    ArticleText = Regex.Replace (ArticleText, angle_close, ">");

//---------------------------< F I N I S H >------------------------------------------------------------------

    if (status_added)  // build our edit summary
        Summary = summary_concat (Summary, " IUCN status added;");

    if (0 != iucn_status_confirmed_count)  // build our edit summary
        Summary = summary_concat (Summary, " IUCN status confirmed" + ((1 < iucn_status_confirmed_count) ? " (" + iucn_status_confirmed_count + "×);" : ";"));

    if (0 != iucn_status_updated_count)
        Summary = summary_concat (Summary, " IUCN status updated" + ((1 < iucn_status_updated_count) ? " (" + iucn_status_updated_count + "×);" : ";"));

    if ((0 != iucn_status_confirmed_count) || (0 != iucn_status_updated_count) || status_added)
    {
        if (0 != iucn_status_system_updated_count)
            Summary = summary_concat (Summary, " IUCN status system updated;");
        else if (status_system_added)
            Summary = summary_concat (Summary, " IUCN status system added;");
    }

    string dup_text = "";
    switch (duplicates_removed_count)
    {
        case 0:
            dup_text = ";";
            break;
        case 1:
            dup_text = " [duplicate removed];";
            break;
        default:
            dup_text = " [duplicates removed (" + duplicates_removed_count + "×)];";
            break;
    }

    if (status_ref_added)
        Summary = summary_concat (Summary, " IUCN status ref added" + dup_text);
    if (status_ref_updated)
        Summary = summary_concat (Summary, "  IUCN status ref updated" + dup_text);
    if (status_ref_current)
        Summary = summary_concat (Summary, "  IUCN status ref current;");

    if (0 != plain_text_count)  // build our edit summary
    {
        Summary = summary_concat (Summary, " evaluated " + plain_text_count + " reference" + (1 == plain_text_count ? ";" : "s;"));
        if (0 != plain_text_modified_count)
            Summary = summary_concat (Summary, " " + plain_text_modified_count + " reference" + (1 == plain_text_modified_count ? " " : "s ") + "modified;");
    }

    if (0 != iucn_template_count)
    {
        Summary = summary_concat (Summary, " evaluated " + iucn_template_count + " {{cite IUCN}}" + (1 == iucn_template_count ? ";" : "s;"));
        if (0 != template_modified_count)
            Summary = summary_concat (Summary, " " + template_modified_count + " template" + (1 == template_modified_count ? " " : "s ") + "modified;");
    }

    if ((0 != other_template_count) && (0 != other_template_modified_count))  // only report 'other templates' when we modify
    {
        Summary = summary_concat (Summary, " evaluated " + other_template_count + " other template" + (1 == other_template_count ? ";" : "s;"));
        if (0 != other_template_modified_count)
            Summary = summary_concat (Summary, " " + other_template_modified_count + " template" + (1 == other_template_modified_count ? " " : "s ") + "modified;");
    }

    if (0 != page_doi_skip_count)
        Summary = summary_concat (Summary, " skipped doi/page mismatch (" + page_doi_skip_count + "×);");

    if (0 != api_no_cite_return_count)
        Summary = summary_concat (Summary, " API cite nil return (" + api_no_cite_return_count + "×);");

    if (0 != api_no_species_return_id_count)  // for {{IUCN status}}
        Summary = summary_concat (Summary, " API species nil return (id) (" + api_no_species_return_id_count + "×);");

    if (0 != api_no_species_return_name_count)
        Summary = summary_concat (Summary, " API species nil return (name) (" + api_no_species_return_name_count + "×);");

    if (null != unrecognized_species_name)
        Summary = summary_concat (Summary, " unrecognized binomial: " + unrecognized_species_name + ";");

    stopwatch.Stop();  // stop the stopwatch
    TimeSpan ts = stopwatch.Elapsed;  // get the elapsed time and tack it onto the edit summary
    Summary = Summary + " (" + api_call_count + "/" + String.Format("{0:00}:{1:00}.{2:00}", ts.Minutes, ts.Seconds, ts.Milliseconds / 10) + ");";

    if (!status_ref_added && !status_ref_updated && (0 == iucn_status_updated_count))  // iucn_status_updated_count for {{IUCN status}} updates (List of reptiles of North America)
    {
        if (0 == iucn_template_count)
        {
            if ((0 != plain_text_count) && (plain_text_count == page_doi_skip_count))
            {
                error_log_add ("auto-skipped: doi/page mismatch");
                Skip = true;
            }
            if ((0 != plain_text_count) && (plain_text_count == api_no_cite_return_count))
            {
                error_log_add ("auto-skipped: number of plain-text citations is same as number of API citation nil returns");
                Skip = true;
            }
        }
        if (0 == plain_text_count)
        {
            if ((0 != iucn_template_count) && (iucn_template_count == page_doi_skip_count))
            {
                error_log_add ("auto-skipped: doi/page mismatch");
                Skip = true;
            }
            if ((0 != iucn_template_count) && (iucn_template_count == api_no_cite_return_count))
            {
                error_log_add ("auto-skipped: number of cite IUCN templates is same as number of API citation nil returns");
                Skip = true;
            }
        }
    }

    if ("" == ArticleText)  // trap to see if the 'blanked' pages
        // that sometimes occur are the fault of this script
    {
        error_log_add ("auto-skipped: ArticleText is empty string");  // error message
        Skip = true;  // force a skip
    }

    if (0 != error_log_list.Count)
        log_errors (ArticleTitle, error_log_list);

    return ArticleText;
}


//===========================<< S U P P O R T >>==============================================================

//---------------------------< N A M E D _ S T A T U S _ R E F _ D U P _ R E M O V E >------------------------
//
//
//
//private string named_status_ref_dup_remove (ref string text, string taxobox_status_ref_pattern, string taxobox_status_ref_sc_tag)
//{
//    Match dup_match = Regex.Match (text, @"\<[Rr][Ee][Ff]\s*name\s*=\s*""?([^""\>]+)""?\>\s*" + taxobox_status_ref_pattern + @"\s*\</[Rr][Ee][Ff]\>");
//    if (dup_match.Success)
//    {
//        string name = dup_match.Groups[1].Value;  // get the reference's name from <ref name=...> tag
//        string ref_tag_replace_pattern = @"\<[Rr][Ee][Ff]\s*name\s*=\s*""""?" + name + @"""""?\s*\>";  // make a <ref name=... > pattern from name
//        string sc_replace_pattern = @"\<[Rr][Ee][Ff]\s*name\s*=\s*""""?" + name + @"""""?\s*/\>";  // make a self-closed <ref name=... /> pattern from name
//        text = Regex.Replace (text, sc_replace_pattern, taxobox_status_ref_sc_tag);  // replace any <ref name=... /> with <ref name="iucn status <date> /> sc tag
//        text = counted_replace (text, ref_open_tag_named + @"\s*" + taxobox_status_ref_pattern + @"\s*" + ref_close_tag, taxobox_status_ref_sc_tag, ref duplicates_removed_count);  // now remove any duplicates
//        return sc_replace_pattern;
//    }
//    return null;
//}

private void named_status_ref_dup_remove (ref string article_text, ref string taxobox, string taxobox_status_ref_pattern, string taxobox_status_ref_sc_tag)
{
    Match dup_match;
    string name = null;
    string ref_tag_replace_pattern = null;
    string sc_replace_pattern = null;

    // first pass: named duplicates that are defined inside the taxobox
    dup_match = Regex.Match (taxobox, @"\<[Rr][Ee][Ff]\s*name\s*=\s*""?([^""\>]+)""?\>\s*" + taxobox_status_ref_pattern + @"\s*\</[Rr][Ee][Ff]\>");
    while (dup_match.Success)
    {
        name = dup_match.Groups[1].Value;  // get the reference's name from <ref name=...> tag
        ref_tag_replace_pattern = @"\<[Rr][Ee][Ff]\s*name\s*=\s*""?" + name + @"""?\s*\>";  // make a <ref name=... > pattern from name; quotes optional so unquoted names also match
        sc_replace_pattern = @"\<[Rr][Ee][Ff]\s*name\s*=\s*""?" + name + @"""?\s*/\>";  // make a self-closed <ref name=... /> pattern from name
        taxobox = Regex.Replace (taxobox, sc_replace_pattern, taxobox_status_ref_sc_tag);  // replace any <ref name=... /> with <ref name="iucn status <date> /> sc tag
        article_text = Regex.Replace (article_text, sc_replace_pattern, taxobox_status_ref_sc_tag);  // replace any <ref name=... /> with <ref name="iucn status <date> /> sc tag
        taxobox = counted_replace (taxobox, ref_tag_replace_pattern + @"\s*" + taxobox_status_ref_pattern + @"\s*" + ref_close_tag, taxobox_status_ref_sc_tag, ref duplicates_removed_count);  // now remove any duplicates
        dup_match = Regex.Match (taxobox, @"\<[Rr][Ee][Ff]\s*name\s*=\s*""?([^""\>]+)""?\>\s*" + taxobox_status_ref_pattern + @"\s*\</[Rr][Ee][Ff]\>");
    }

    // second pass: named duplicates that are defined in the article text
    dup_match = Regex.Match (article_text, @"\<[Rr][Ee][Ff]\s*name\s*=\s*""?([^""\>]+)""?\>\s*" + taxobox_status_ref_pattern + @"\s*\</[Rr][Ee][Ff]\>");
    while (dup_match.Success)
    {
        name = dup_match.Groups[1].Value;  // get the reference's name from <ref name=...> tag
        ref_tag_replace_pattern = @"\<[Rr][Ee][Ff]\s*name\s*=\s*""?" + name + @"""?\s*\>";  // make a <ref name=... > pattern from name
        sc_replace_pattern = @"\<[Rr][Ee][Ff]\s*name\s*=\s*""?" + name + @"""?\s*/\>";  // make a self-closed <ref name=... /> pattern from name
        article_text = Regex.Replace (article_text, sc_replace_pattern, taxobox_status_ref_sc_tag);  // replace any <ref name=... /> with <ref name="iucn status <date> /> sc tag
        taxobox = Regex.Replace (taxobox, sc_replace_pattern, taxobox_status_ref_sc_tag);  // replace any <ref name=... /> with <ref name="iucn status <date> /> sc tag
        article_text = counted_replace (article_text, ref_tag_replace_pattern + @"\s*" + taxobox_status_ref_pattern + @"\s*" + ref_close_tag, taxobox_status_ref_sc_tag, ref duplicates_removed_count);  // now remove any duplicates
        dup_match = Regex.Match (article_text, @"\<[Rr][Ee][Ff]\s*name\s*=\s*""?([^""\>]+)""?\>\s*" + taxobox_status_ref_pattern + @"\s*\</[Rr][Ee][Ff]\>");
    }
}


//---------------------------< H I D E _ T A X O B O X _ S T A T U S _ R E F >--------------------------------
//
//
//
private string hide_taxobox_status_ref (string taxobox, string taxobox_status_ref_open_tag, string taxobox_status_ref_pattern)
{
    Match dup_match = Regex.Match (taxobox, "(" + taxobox_status_ref_open_tag +")(" + taxobox_status_ref_pattern + ")");  // look for and capture |status_ref= definition
    if (dup_match.Success)
    {
        string hidden_status_ref = hide (dup_match.Groups[2].Value, IS_TAXOBOX);  // spoof to hide {{cite IUCN}} in |status_ref=
        return Regex.Replace (taxobox, "(" + taxobox_status_ref_open_tag +")(" + taxobox_status_ref_pattern + ")", "$1" + hidden_status_ref);  // replace with the hidden definition
    }
    return taxobox;
}


//---------------------------< I U C N   P L A I N - T E X T   R E F E R E N C E   U P D A T E >--------------
//
// this is the plain-text form API id only.
// Plain-text citations must be wrapped with <ref ...>...</ref> tags
//
// known issues:
//    because this attempts to locate 'correct' plain-text citations and because any non-template and non-
//    wikilink text is plain text, plain text inside <ref ...>...</ref> that is not part of the actual IUCN
//    citation will be treated as part of the citation and will be replaced with the {{cite IUCN}} template
//    if the API returns a citation for the taxon id.
//
//    does not update plain-text references in the taxobox (|status_ref= handled above); example: [[Picea abies]]
//
private string plain_text_ref_update (string text, string article_title)
{
    if (Regex.Match (text, plain_text_ref_pattern).Success)  // must have the form <ref ...>plain text</ref> must be constrained because article is plain text
        text = Regex.Replace (text, plain_text_ref_pattern,
            delegate(Match match)
            {
                string plain_text = match.Groups[0].Value;  // this will be returned if no changes
                string taxon_id = plain_text_taxon_id_get (plain_text);  // attempt to get taxon id
                if (null == taxon_id)
                    return plain_text;  // no taxon id so abandon

                if (is_plain_text_rejected (plain_text))  // returns true when plain_text is rejected
                    return plain_text;

                string ref_open = match.Groups[1].Value.Trim();  // the opening <ref> tag
                string ref_close = match.Groups[3].Value.Trim();  // the closing </ref> tag

                plain_text_count++;  // bump total number of plain-text references found

                string api_url = api_id_url + taxon_id + api_token;  // build the url from its various parts
                string cite_iucn = cite_iucn_get (api_url, null, article_title, taxon_id, null);  // go build a {{cite IUCN}} template from the api
                if (null == cite_iucn)
                    return plain_text;  // template build failed

                plain_text_modified_count++;
                return ref_open + cite_iucn + ref_close;
            });

    return text;
}


//---------------------------< T A X O B O X _ G E T >--------------------------------------------------------
//
// gets the {{taxobox}} or {{speciesbox}} template from <article_text>
//
private string taxobox_get (string article_text)
{
    if (Regex.Match (article_text, taxobox_template_pattern).Success)
        return Regex.Match (article_text, taxobox_template_pattern).Groups[0].Value;
    return null;
}


//---------------------------< T A X O B O X _ U P D A T E >--------------------------------------------------
//
// updates |status=, |status_system=, and |status_ref= parameters; returns true when updated; false else
//
private bool taxobox_update (ref string taxobox, ref string article_text, string article_title)
{
    if (null == taxobox)  // if no taxobox
        return false;

    taxobox_blank = Regex.Replace (taxobox, taxobox_template_pattern, "$1$3");

    taxobox = Regex.Replace (taxobox, stray_dot, "$1");  // delete stray . because I found one such
    taxobox = Regex.Replace (taxobox, stray_splat, "$1");  // delete stray * because I found one such
    taxobox = Regex.Replace (taxobox, stray_equal, "$1");  // delete stray = because I found one such
    taxobox = Regex.Replace (taxobox, stray_nbsp, "$1");  // delete stray &nbsp; because I found one such
    taxobox = Regex.Replace (taxobox, html_comment, "$1");  // and html comments (Euconocephalus remotus)

    string taxobox_status_val = null;
    string taxobox_status_system_val = null;
    string taxobox_status_ref_val = null;
    string taxobox_status_ref_type = null;
    string taxobox_status_ref_name = null;  // original name from <ref name="original name"> or <ref name="original name" />
    bool taxobox_status_ref_is_empty = false;
    string taxobox_status_date = null;
    int taxobox_status_date_diff = 100;
    string taxobox_species_name_val = null;

    string api_status_val = null;
    string api_status_system_val = null;

    taxobox_species_name_val = taxobox_species_name_get (taxobox, article_title);  // get species name from taxobox or article title

    if (api_species_data_get (taxobox_species_name_val, ref api_status_val, ref api_status_system_val, article_title))
    {  // when here presume that we can also get citation data from api
        taxobox_status_val = taxobox_status_get (taxobox);
        taxobox_status_system_val = taxobox_system_get (taxobox);

        if ((((null != taxobox_status_val) && is_iucn_status (taxobox_status_val)) ||  // has a value that is an IUCN status or
            ((null != taxobox_status_system_val) && is_iucn_system (taxobox_status_system_val))) ||  // has a value that is an IUCN system or
            ((null == taxobox_status_val) && (null == taxobox_status_system_val)))  // both are missing or empty
        {
            taxobox_status_update (ref taxobox, api_status_val, taxobox_status_val);
            taxobox_system_update (ref taxobox, api_status_system_val, taxobox_status_system_val);
        }
        else
            return false;

        taxobox_status_ref_val = taxobox_status_ref_get (taxobox, ref taxobox_status_ref_is_empty);

        if (null != taxobox_status_ref_val)
        {
            if (Regex.Match (taxobox_status_ref_val, amended_text).Success)
            {
                error_log_add ("taxobox_update(): plain-text |status_ref= has amended text");
                return false;
            }
            if (Regex.Match (taxobox_status_ref_val, errata_text).Success)
            {
                error_log_add ("taxobox_update(): plain-text |status_ref= has errata text");
                return false;
            }
            if (Regex.Match (taxobox_status_ref_val, @"__P1P3__\s*(?:errata|amends)\s*=\s*\d{4}").Success)
            {
                error_log_add ("taxobox_update(): |status_ref= citation has |errata= or |amends= parameter");
                return false;
            }
        }

        taxobox_status_ref_type = taxobox_status_ref_type_get (taxobox_status_ref_val, ref taxobox_status_ref_name);

        string api_url = null;

        if (("named" == taxobox_status_ref_type) || ("unnamed" == taxobox_status_ref_type) || (null == taxobox_status_ref_type))
        {
            if (null != taxobox_status_ref_val)
            {
                taxobox_status_date = taxobox_status_date_get (taxobox_status_ref_val, taxobox_status_ref_name);
                taxobox_status_date_diff = taxobox_status_date_diff_get (taxobox_status_date);
            }

            if (6 < taxobox_status_date_diff)
            {
                api_url = api_name_url + taxobox_species_name_val + api_token;  // build citation url from its various parts
                taxobox_status_ref = cite_iucn_get (api_url, null, article_title, null, taxobox_species_name_val);  // go build a {{cite IUCN}} template from the api
                if (null == taxobox_status_ref)
                    return false;  // template build failed

                new_ref_tags_make (taxobox_status_ref, ref taxobox_status_ref_sc_tag, ref taxobox_status_ref_open_tag);

                if (null == taxobox_status_ref_val)  // if empty or missing
                {
                    if (taxobox_status_ref_is_empty)
                    {
                        taxobox = Regex.Replace (taxobox, taxobox_status_ref_empty_pattern, "$1" + taxobox_status_ref_open_tag + taxobox_status_ref + "</ref>$2");
                        status_ref_added = true;
                    }
                    else  // here when |status_ref= is missing
                    {
                        taxobox = Regex.Replace (taxobox, taxobox_new_stat_sys_ref_pattern, "$1$2|status_ref=" + taxobox_status_ref_open_tag + taxobox_status_ref + "</ref>$2$3");
                        status_ref_added = true;
                    }
                }
                else
                {
                    taxobox = Regex.Replace (taxobox, taxobox_status_ref_pattern, "$1" + taxobox_status_ref_open_tag + taxobox_status_ref + "</ref>");
                    if ("named" == taxobox_status_ref_type)  // go rename all of the self-closed ref tags in article text and in the taxobox
                    {
                        article_text = Regex.Replace (article_text, sc_ref_tag_begin + taxobox_status_ref_name + sc_ref_tag_end, taxobox_status_ref_sc_tag);
                        taxobox = Regex.Replace (taxobox, sc_ref_tag_begin + taxobox_status_ref_name + sc_ref_tag_end, taxobox_status_ref_sc_tag);
                    }
                    status_ref_updated = true;
                }
            }
            else
                status_ref_current = true;
        }
        else if ("named_sc" == taxobox_status_ref_type)
        {
            if (Regex.Match (article_text, ref_def_begin + taxobox_status_ref_name + ref_def_end).Success)
            {
                taxobox_status_ref_val = Regex.Match (article_text, ref_def_begin + taxobox_status_ref_name + ref_def_end).Groups[0].Value;
                taxobox_status_ref_val = unhide (taxobox_status_ref_val);
                taxobox_status_date = taxobox_status_date_get (taxobox_status_ref_val, taxobox_status_ref_name);
                taxobox_status_date_diff = taxobox_status_date_diff_get (taxobox_status_date);

                if (6 < taxobox_status_date_diff)
                {
                    api_url = api_name_url + taxobox_species_name_val + api_token;  // build citation url from its various parts
                    taxobox_status_ref = cite_iucn_get (api_url, null, article_title, null, taxobox_species_name_val);  // go build a {{cite IUCN}} template from the api
                    if (null == taxobox_status_ref)
                        return false;  // template build failed

                    new_ref_tags_make (taxobox_status_ref, ref taxobox_status_ref_sc_tag, ref taxobox_status_ref_open_tag);

                    // replace original definition with new sc ref tag
                    article_text = Regex.Replace (article_text, ref_def_begin + taxobox_status_ref_name + ref_def_end, taxobox_status_ref_sc_tag);
                    // replace original |status_ref= sc ref tag with new definition
                    taxobox = Regex.Replace (taxobox, taxobox_status_sc_ref_pattern, "$1" + taxobox_status_ref_open_tag + taxobox_status_ref + "</ref>");
                    // rename original sc ref tags
                    article_text = Regex.Replace (article_text, sc_ref_tag_begin + taxobox_status_ref_name + sc_ref_tag_end, taxobox_status_ref_sc_tag);
                    taxobox = Regex.Replace (taxobox, sc_ref_tag_begin + taxobox_status_ref_name + sc_ref_tag_end, taxobox_status_ref_sc_tag);
                    status_ref_updated = true;
                }
            }
            else
                error_log_add ("taxobox_update(): no definition for: " + code_nowiki (taxobox_status_ref_val));
        }
        else
        {
            error_log_add ("taxobox_update(): no " + code_nowiki ("|status_ref="));
        }
    }
    else  // here when binomial is not recognized by iucn
    {
        if (null != taxobox_species_name_val)
        {
            taxobox_status_val = taxobox_status_get (taxobox);  // if either of these then add a maintenance category and ...
            taxobox_status_system_val = taxobox_system_get (taxobox);  // ... save unrecognized binomial for edit summary only when ...
            if ((((null != taxobox_status_val) && is_iucn_status (taxobox_status_val)) ||  // ...
                // |status= has a value that is an IUCN status or
                ((null != taxobox_status_system_val) && is_iucn_system (taxobox_status_system_val))) ||  // |status_system= has a value that is an IUCN system or
                ((null == taxobox_status_val) && (null == taxobox_status_system_val)))  // both are missing or empty (example: Barlow's lark)
            {
                unrecognized_species_name = Uri.UnescapeDataString (taxobox_species_name_val);  // remove percent encoding
                string cat_plus_name = "[[Category:Taxobox binomials not recognized by IUCN]]" + " <!-- " + unrecognized_species_name + " -->";

                MatchCollection matches = Regex.Matches (article_text, @"__WL1NK_O__[Cc]ategory:.+__WL1NK_C__");  // find all of the categories
                if (0 != matches.Count)  // non-zero when categories found
                {
                    int index = matches.Count - 1;  // make an indexer from Count and then replace last one with itself + our category
                    article_text = Regex.Replace (article_text, matches[index].Value, matches[index].Value + '\n' + cat_plus_name);
                }
                else  // here when no categories; look for stub templates
                {
                    matches = Regex.Matches (article_text, @"__0P3N__.+\-stub__CL0S3__");  // find all of the stub templates
                    if (0 != matches.Count)  // non-zero when stub templates found
                        article_text = Regex.Replace (article_text, matches[0].Value, cat_plus_name + '\x0A' + '\x0A' + matches[0].Value);
                    else  // here when no categories and no stub templates
                        article_text = article_text + '\x0A' + cat_plus_name;  // no cats and no stub templates, add to the end
                }
                // binomial may not be recognized for a global assessment but is recognized for a regional assessment;
                // this script cannot know which region so cannot use the regional form of the citation API call:
                //    /api/v3/species/citation/:name/region/:region_identifier?token='YOUR TOKEN'
                // binomial may be recognized in iucn search box (as a redirect-like name) but that is not available
                // to the API (and if it were probably shouldn't be used)
            }
        }
    }

    taxobox = unhide (taxobox);
    article_text = Regex.Replace (article_text, taxobox_template_pattern, taxobox_blank);  // install a
        // blank so that we don't spend time evaluating the citation in |status_ref=
    return true;
}


//---------------------------< N E W _ S E L F _ C L O S E D _ T A G S _ M A K E >----------------------------
//
// makes self-closed and normal <ref> tags for new |status_ref= {{cite IUCN}} reference using |access-date= from
// the {{cite IUCN}} template
//
private void new_ref_tags_make (string cite_iucn, ref string new_self_closed_tag, ref string taxobox_status_ref_open_tag)
{
    string date = Regex.Match (cite_iucn, access_date).Groups[1].Value.Trim();  // date from new {{cite IUCN}} |access-date=

    new_self_closed_tag = @"<ref name=""iucn status " + date + @""" />";  // make a version to replace short-form ref tags that need to be renamed
    taxobox_status_ref_open_tag = @"<ref name=""iucn status " + date + @""">";  // make a version for |status_ref=
}


//---------------------------< T A X O B O X _ S T A T U S _ G E T >------------------------------------------
//
// gets value assigned to {{taxobox}} or {{speciesbox}} |status= parameter; returns that value; status validation
// is done by calling function; returns null if |status= is missing or empty.
//
private string taxobox_status_get (string taxobox_template)
{
    if (!Regex.Match (taxobox_template, taxobox_status_missing).Success || Regex.Match (taxobox_template, taxobox_status_empty).Success)
        return null;  // |status= is missing or empty

    return Regex.Match (taxobox_template, taxobox_status_value).Groups[2].Value.Trim();
}


//---------------------------< I S _ I U C N _ S T A T U S >--------------------------------------------------
//
// return true if <status> is known IUCN category; false else
//
private bool is_iucn_status (string status)
{
    if (null == status)
        return false;

    return Regex.Match (status, IS_IUCN_STATUS).Success;
}


//---------------------------< T A X O B O X _ S T A T U S _ U P D A T E >------------------------------------
//
// updates, adds, or confirms |status= in taxobox using value from iucn API
//
private void taxobox_status_update (ref string taxobox, string api_status_val, string taxobox_status_val)
{
    if (null == api_status_val)  // did api return species data with IUCN category?
        return;

    if (!Regex.Match (taxobox, taxobox_status_missing).Success)  // if |status= not in taxobox
    {
        taxobox = Regex.Replace (taxobox, taxobox_new_stat_sys_ref_pattern, "$1$2|status=" + api_status_val + "$2$3");
        status_added = true;
    }
    else if (api_status_val != taxobox_status_val)
    {
        taxobox = Regex.Replace (taxobox, taxobox_status_pattern, "$1" + api_status_val + "$2");
        iucn_status_updated_count++;
    }
    else  // here when <api_status_val> == <taxobox_status_val>
        iucn_status_confirmed_count++;  // bump the confirmed count and done
}


//---------------------------< T A X O B O X _ S Y S T E M _ G E T >------------------------------------------
//
// gets value assigned to {{taxobox}} or {{speciesbox}} |status_system= parameter; returns that value; status_system
// validation is done by calling function; returns null if |status_system= is missing or empty.
//
private string taxobox_system_get (string taxobox_template)
{
    if (!Regex.Match (taxobox_template, taxobox_system_missing).Success || Regex.Match (taxobox_template, taxobox_system_empty).Success)
        return null;  // |status_system= is missing or empty

    return Regex.Match (taxobox_template, taxobox_system_value).Groups[2].Value.Trim();
}


//---------------------------< I S _ I U C N _ S Y S T E M >--------------------------------------------------
//
// return true if <system> is known IUCN status system; false else
//
private bool is_iucn_system (string system)
{
    if (null == system)
        return false;

    return Regex.Match (system, IS_IUCN_SYSTEM).Success;
}


//---------------------------< T A X O B O X _ S Y S T E M _ U P D A T E >------------------------------------
//
// updates, adds, or confirms |status_system= in taxobox using value from iucn API
//
private void taxobox_system_update (ref string taxobox, string api_status_system_val, string taxobox_status_system_val)
{
    if (null == api_status_system_val)  // did api return species data with IUCN
        // category?
        return;

    if (!Regex.Match (taxobox, taxobox_system_missing).Success)  // if |status_system= not in taxobox
    {
        taxobox = Regex.Replace (taxobox, taxobox_new_stat_sys_ref_pattern, "$1$2|status_system=" + api_status_system_val + "$2$3");
        status_system_added = true;
    }
    else if (api_status_system_val != taxobox_status_system_val)
    {
        taxobox = Regex.Replace (taxobox, taxobox_system_pattern, "$1" + api_status_system_val + "$2");
        iucn_status_system_updated_count++;
    }
}


//---------------------------< T A X O B O X _ S T A T U S _ R E F _ G E T >----------------------------------
//
// gets value assigned to {{taxobox}} or {{speciesbox}} |status_ref= parameter; returns that value; ref tags,
// ref name, and reference text extracted by calling function
//
private string taxobox_status_ref_get (string taxobox, ref bool taxobox_status_ref_is_empty)
{
    if (!Regex.Match (taxobox, taxobox_status_ref_missing).Success)
        return null;  // |status_ref= is missing

    if (Regex.Match (taxobox, taxobox_status_ref_empty).Success)
    {
        taxobox_status_ref_is_empty = true;
        return null;  // |status_ref= is empty
    }

    return Regex.Match (taxobox, taxobox_status_ref_value).Groups[2].Value.Trim();
}


//---------------------------< T A X O B O X _ S T A T U S _ R E F _ T Y P E _ G E T >------------------------
//
// look at opening <ref> tag and return its type (order of evaluation is important here):
//    <ref> returns 'unnamed'
//    <ref ... name = .../> returns 'named_sc'
//    <ref ...
//    name = ...> returns 'named'
// if none of these, or <taxobox_status_ref_val> is null, returns null
//
private string taxobox_status_ref_type_get (string taxobox_status_ref_val, ref string taxobox_status_ref_name)
{
    if (null == taxobox_status_ref_val)
        return null;

    if (Regex.Match (taxobox_status_ref_val, ref_tag_unnamed_pattern).Success)
        return "unnamed";

    if (Regex.Match (taxobox_status_ref_val, ref_tag_named_sc_pattern).Success)  // order here important; named_sc test before named test
    {
        taxobox_status_ref_name = Regex.Match (taxobox_status_ref_val, ref_tag_named_sc_pattern).Groups[2].Value.Trim();
        return "named_sc";
    }

    if (Regex.Match (taxobox_status_ref_val, ref_tag_named_pattern).Success)  // order here important; named test after named_sc test
    {
        taxobox_status_ref_name = Regex.Match (taxobox_status_ref_val, ref_tag_named_pattern).Groups[2].Value.Trim();
        return "named";
    }

    return null;  // should never get here
}


//---------------------------< T A X O B O X _ S T A T U S _ D A T E _ G E T >--------------------------------
//
// attempt to get date of last status update from ref tag (<ref name="iucn status 29 September 2021">) or from
// |access-date= value
//
private string taxobox_status_date_get (string taxobox_status_ref_val, string taxobox_status_ref_name)
{
    if ((null != taxobox_status_ref_name) && Regex.Match (taxobox_status_ref_name, preferred_status_ref_tag_name).Success)
        return Regex.Match (taxobox_status_ref_name, preferred_status_ref_tag_name).Groups[1].Value.Trim();

    taxobox_status_ref_val = unhide (taxobox_status_ref_val);
    if (Regex.Match (taxobox_status_ref_val, access_date).Success)
        return Regex.Match (taxobox_status_ref_val, access_date).Groups[1].Value.Trim();  // date from |access-date=

    return null;
}


//---------------------------< T A X O B O X _ S T A T U S _ D A T E _ D I F _ G E T >------------------------
//
// return the difference in months between today's date and a date from the |status_ref= <ref> tag or from the
// |status_ref= citation's |access-date=
//
// script will not update
// |status_ref= if date difference is less than 7 months
//
private int taxobox_status_date_diff_get (string date)
{
    if (null == date)
    {
//        error_log_add ("taxobox_status_date_diff_get(): nil date value; forcing update");  // not really an error
        return 100;  // any value greater than 6 forces citation update attempt
    }

    int current_month = DateTimeOffset.Now.Month;
    int current_year = DateTimeOffset.Now.Year;
    string month = null;
    string year = null;

    foreach(KeyValuePair<string, string> date_pattern in date_patterns)
    {
        Match match = Regex.Match (date, date_pattern.Value);
        if (match.Success)
        {
            if ("ymd" == date_pattern.Key)  // because year precedes month, Group[1] and Group[2] are ordered differently
            {
                month = match.Groups[2].Value.Trim().ToLower();
                year = match.Groups[1].Value.Trim();
            }
            else  // here when dmy or mdy
            {
                month = match.Groups[1].Value.Trim().ToLower();
                year = match.Groups[2].Value.Trim();
            }
        }
    }

    if ((null == month) || (null == year))
    {
        error_log_add ("taxobox_status_date_diff_get(): month and/or year null; forcing update");
        error_log_add ("year: " + year);
        error_log_add ("month: " + month);
        return 100;  // any value greater than 6 forces citation update attempt
    }

    if (months.ContainsKey (month))
        return ((current_year - Int32.Parse(year)) * 12) + current_month - months[month];
    else
    {
        error_log_add ("taxobox_status_date_diff_get(): month not recognized: " + month + "; forcing update");
        return 100;
    }
}


//---------------------------< T A X O B O X _ S P E C I E S _ N A M E _ G E T >------------------------------
//
// attempts to get binomial from various parameters in {{taxobox}} or {{speciesbox}} and failing that the article
// title.
//
// taxobox: |binomial= -> |name= -> article title
// speciesbox: |taxon= -> |genus= + |species= -> |name= -> article title
//
// returns null when <name> is not binomial-like (two words); example [[Africanogyrus]]
//
private string taxobox_species_name_get (string taxobox, string article_title)
{
    string template_name = Regex.Match (taxobox, taxobox_template_pattern).Groups[2].Value.ToLower();  //
	// capture is the template name (Taxobox, Speciesbox, etc)
	string name = null;  // name of this species from various possible parameters in the taxobox template

	if ("taxobox" == template_name)
	{
		if (Regex.Match (taxobox, binomial_pattern).Success)
			name = Regex.Match (taxobox, binomial_pattern).Groups[1].Value.Trim();  // use |binomial=
		else if (Regex.Match (taxobox, name_pattern).Success)
			name = Regex.Match (taxobox, name_pattern).Groups[1].Value.Trim();  // fallback to |name=
	}
	else if ("speciesbox" == template_name)
	{
		if (Regex.Match (taxobox, taxon_pattern).Success)
			name = Regex.Match (taxobox, taxon_pattern).Groups[1].Value.Trim();  // use |taxon=
		else if (Regex.Match (taxobox, genus_pattern).Success && Regex.Match (taxobox, species_pattern).Success)
			name = Regex.Match (taxobox, genus_pattern).Groups[1].Value.Trim() + " " + Regex.Match (taxobox, species_pattern).Groups[1].Value.Trim();
		else if (Regex.Match (taxobox, name_pattern).Success)
			name = Regex.Match (taxobox, name_pattern).Groups[1].Value.Trim();  // fallback to |name=
	}

	if (null == name)  // when none of the above
	{
		name = article_title;  // TODO: don't use article title?
		error_log_add ("using article title");
	}

	name = species_name_cleanup (name);  // remove markup, extinction markers, disambiguation, etc

	if (!Regex.Match (Uri.UnescapeDataString (name), @"[A-Za-z]+ [A-Za-z]+").Success)  // does <name> look like a binomial?
	{
		error_log_add ("name not a binomial: " + name);
		return null;
	}

	return name;
}

//---------------------------< T A X O N _ I D _ O L D _ F O R M _ U R L _ G E T >----------------------------
//
// loops through a series of old-form IUCN urls and returns the taxon id if the pattern matches; null else
//
private string taxon_id_from_old_form_url_get (string text)
{
	foreach (string url_pattern in url_patterns)  // loop through a series of old-form url patterns
	{
		Match url_match = Regex.Match (text, url_pattern);
		if (url_match.Success)  // if found
			return url_match.Groups[1].Value.Trim();  // extract and return the taxon id
	}
	return null;
}

//---------------------------< P L A I N _ T E X T _ T A X O N _ I D _ G E T >--------------------------------
//
// extract taxon id from IUCN page, doi, or url.  For plain-text citations, accept any form of iucn url when
// attempting to get the taxon id; prefer page -> doi -> url; returns taxon id if available, null else
//
private string plain_text_taxon_id_get (string plain_text)
{
	if (Regex.Match (plain_text, plain_text_page_taxon_id).Success)  // get taxon id from page?
		return Regex.Match (plain_text, plain_text_page_taxon_id).Groups[1].Value;

	if (Regex.Match (plain_text, plain_text_doi_taxon_id).Success)  // get taxon id from doi?
		return Regex.Match (plain_text, plain_text_doi_taxon_id).Groups[1].Value;

	if (Regex.Match (plain_text, plain_text_taxon_id_url).Success)  // get taxon id from url?
		return Regex.Match (plain_text, plain_text_taxon_id_url).Groups[1].Value;

	return null;  // couldn't find taxon id; might not be iucn reference
}

//---------------------------< I S _ P L A I N _ T E X T _ R E J E C T E D >----------------------------------
//
// evaluates <plain_text> looking for things that oughtn't to be there or that are not currently supported
// returns true when <plain_text> is rejected; false else
//
private bool is_plain_text_rejected (string plain_text)
{
	if (Regex.Match (plain_text, @"\{\{\s*[Cc]it[ae]").Success)  // if 'plain text' has {{cit...}} template
	{
		//error_log_add ("is_plain_text_rejected(): plain-text has cite template: " + plain_text);  // don't do this because it alarms on valid cite IUCN templates
		return true;  // skip this reference
	}

	if (Regex.Match (plain_text, amended_text).Success)
	{
		error_log_add ("is_plain_text_rejected(): plain-text has amended text");
		return true;  // because API doesn't yet identify amended assessment year
	}

	if (Regex.Match (plain_text, errata_text).Success)
	{
		error_log_add ("is_plain_text_rejected(): plain-text has errata text");
		return true;  // because API doesn't yet identify errata assessment year
	}

	return false;
}

//---------------------------< S P E C I E S _ N A M E _ C L E A N U P >--------------------------------------
//
// removes stuff that isn't part of the binomial; returns name modified or not.
//
private string species_name_cleanup (string name)
{
	name = Regex.Replace (name, "__4ng13_0__", "<");  // unhide html comments that might be part of <name>
	name = Regex.Replace (name, "__4ng13_C__", ">");

	foreach (string [] cleanup_pattern in cleanup_patterns)
		name = Regex.Replace (name, cleanup_pattern[0], cleanup_pattern[1]);

	name = name.Trim();  // and remove any leading/trailing whitespace
	name = Uri.EscapeDataString (name);  // percent encode uri reserved characters
	return name;
}

//---------------------------< C I T E _ I U C N _ G E T >----------------------------------------------------
//
// creates {{cite IUCN}} template from api call.  Tries <first_url> first and if successful ignores <second_url>
// tries <second_url> else
//
private string cite_iucn_get (string first_url, string second_url, string ArticleTitle, string taxon_id, string species_name)
{
	string citation_from_api = null;
	string raw_citation = null;

	if ((null == first_url) && (null == second_url))
		return null;

	var urls = new List<string>();
	urls.Add (first_url);
	urls.Add (second_url);

	foreach (string url in urls)
	{
		if (null != url)
		{
			citation_from_api = api_fetch (url, ArticleTitle);  // fetch citation from the IUCN API
			if (null == citation_from_api)
				return null;

			if (Regex.Match (citation_from_api, citation_from_api_pattern).Success)
			{
				raw_citation = Regex.Match (citation_from_api, citation_from_api_pattern).Groups[1].Value.Trim();
				break;
			}
		}
	}

	if (null == raw_citation)  // <raw_citation> must have a value
	{
		string text = "cite_iucn_get(): API did not return citation:";
		if (null != taxon_id)
			text = text + " id: " + taxon_id;
		if (null != species_name)
			text = text + " name: " + species_name;
		text = text + " " + code_nowiki (citation_from_api);
		error_log_add (text);
		api_no_cite_return_count++;
		return null;
	}

	string author_list = "";
	string date = "";
	string title = "";
	string volume = "";
	string page = "";
	string page_assessment = "";
	string doi = "";
	string doi_assessment = "";
	string access_date = "";

	Match parse = Regex.Match (raw_citation, parse_pattern);
	if (parse.Success)
	{
		author_list = author_names_get (parse.Groups[1].Value.Trim());
		date = @" |date=" + parse.Groups[2].Value.Trim();
		title = title_get (parse.Groups[3].Value.Trim());
		volume = @" |volume=" + parse.Groups[4].Value.Trim();
		page = @" |page=" + parse.Groups[5].Value.Trim();
		page_assessment = parse.Groups[6].Value.Trim();
		doi = @" |doi=" + parse.Groups[7].Value.Trim();
		doi_assessment = parse.Groups[8].Value.Trim();
		access_date = @" |access-date=" + parse.Groups[9].Value.Trim();
	}
	else
	{
		error_log_add ("cite_iucn_get(): parse failure: " + code_nowiki (citation_from_api));
		parse_fail_count++;
		return null;
	}

	if (page_assessment != doi_assessment)  // until errata date information available from the API
	{
		error_log_add ("cite_iucn_get(): doi/page mismatch: page assessment: " + code_nowiki (parse.Groups[5].Value.Trim()));
		page_doi_skip_count++;  // skip template when page- and doi-assessment ids are mismatched
		return null;
	}

	return @"{{cite IUCN" + author_list + date + title + volume + page + doi + access_date + @"}}";
}

//---------------------------< A P I _ S P E C I E S _ D A T A _ G E T >--------------------------------------
//
// using taxon name, attempt to get species data from the IUCN API.
//
private bool api_species_data_get (string taxobox_species_name_val, ref string api_status_val, ref string api_status_system_val, string article_title)
{
	if (null == taxobox_species_name_val)  // when taxobox_species_name_get() can't get a binomial-like name
		return false;

	string api_url = api_species_url + taxobox_species_name_val + api_token;  // build a url from its various parts (taxon name)
	string species_from_api = api_fetch (api_url, article_title);  // fetch species data from the IUCN API (taxon name)

	if (null == species_from_api)  // if the api call failed
		return false;  // abandon

	if (Regex.Match (species_from_api, status_from_api_pattern).Success)
		// update <api_status_val> from api return
		api_status_val = Regex.Match (species_from_api, status_from_api_pattern).Groups[1].Value;

	if (Regex.Match (species_from_api, status_system_from_api_pattern).Success)  // update <api_status_system_val> from api return
	{
		int year = Int32.Parse (Regex.Match (species_from_api, status_system_from_api_pattern).Groups[1].Value);  // convert to an integer
		api_status_system_val = ((2000 < year) ? "IUCN3.1" : "IUCN2.3");  // and then convert to the appropriate status system
	}

	if ((null == api_status_val) || (null == api_status_system_val))  // if either of these are null, declare an error
	{
		error_log_add ("api_species_data_get(): API did not return species data: " + code_nowiki (species_from_api));
		api_no_species_return_name_count++;
		return false;  // and abandon
	}

	return true;
}

//---------------------------< A P I _ F E T C H >------------------------------------------------------------
//
// calls the iucn api with <api_url>; returns raw data string on success; null else.  Bumps the api call counter
//
//
private string api_fetch (string api_url, string ArticleTitle)
{
	if (0 < api_call_count)  // pause here for 3 seconds if <api_call_count> is greater than 0 (pause is skipped for the first api access)
		System.Threading.Thread.Sleep (3000);  // this prevents us from banging on the API too quickly

	api_call_count++;  // bump the call counter
	string string_from_api = null;

	try
	{  // this WebRequest code courtesy of en.wiki editor User:DavidBrooks
		System.Net.HttpWebRequest webRequest = (System.Net.HttpWebRequest)System.Net.WebRequest.Create(api_url);
		webRequest.UserAgent = "Wikipedia IUCN citation update experiment (https://www.search.com.vn/wiki/en/User:Trappist_the_monk)";
		System.IO.Stream str = webRequest.GetResponse().GetResponseStream();
		string_from_api = new System.IO.StreamReader(str).ReadToEnd();
	}
	catch
	{
		error_log_add ("api_fetch(): Exception occurred reading: " + code_nowiki (api_url));
		api_fetch_fail_count++;
		return null;
	}

	return string_from_api;
}

//---------------------------< A U T H O R _ N A M E S _ G E T >----------------------------------------------
//
// attempts to extract individual author names from iucn api citation.  Derived from [[Module:cite IUCN]] function
// make_cite_iucn()
//
private string author_names_get (string raw_author_list)
{
	string collaboration = null;

	string pattern = @"(,\s+[A-Z]),";  // for when iucn forgets to include final dot
	raw_author_list = Regex.Replace (raw_author_list, pattern, "$1" + ".,");

	pattern = @"(\.[A-Z]),";  // for when iucn forgets to include final dot
	raw_author_list = Regex.Replace (raw_author_list, pattern, "$1" + ".,");

	pattern = @"\s\(([^\)]+)\)$";
	if (Regex.Match (raw_author_list, pattern).Success)
	{
		collaboration = Regex.Match (raw_author_list, pattern).Groups[1].Value.Trim();  // save the collaboration name
		raw_author_list = Regex.Replace (raw_author_list, pattern, "");  // remove collaboration from raw_author_list
	}

	raw_author_list = Regex.Replace (raw_author_list, @"\.?,?\s+&\s+", ".|");  // replace <opt. dot><opt. comma><space><ampersand><space> with <dot><pipe>
	raw_author_list = Regex.Replace (raw_author_list, @"\.,\s+", ".|");  // replace <dot><comma><space> with <dot><pipe>
	raw_author_list = Regex.Replace (raw_author_list, @"(\.[A-Z]),\s+", "$1.|");  // special case where iucn drops the dot after an initial

	string author_list = "";
	string[] authors = Regex.Split (raw_author_list, @"\|");  // split the string on the <pipe>
	int i = 1;

	foreach (string author in authors)
	{
		if (1 == i)
			author_list = author_list + " |author" + "=" + author;  // don't enumerate first author
		else
			author_list = author_list + " |author" + i + "=" + author;
		i++;
	}

	if (null != collaboration)
		author_list = author_list + " |collaboration=" + collaboration;

	return author_list;
}

//---------------------------< T I T L E _ G E T >------------------------------------------------------------
//
// extracts title from iucn API citation; attempts to add markup so that it renders correctly
//
private string title_get (string raw_title)
{
	string title = null;  // formatted title
	// goes here
	string errata = "";  // errata year, if present, goes here; empty string for concatenation
	string amends = "";  // amends year, if present, goes here; empty string for concatenation
	string pattern = null;
	string replace = null;

	foreach (string[] search_and_replace in search_and_replaces)
	{
		pattern = search_and_replace[0];
		replace = search_and_replace[1];  // replace includes wiki markup for title
		if (Regex.Match (raw_title, pattern).Success)
		{
			title = Regex.Replace (raw_title, pattern, replace);
			break;
		}
	}

	if (null == title)
	{
		title = "''" + raw_title + "''";  // pattern not found apply italic markup to raw_title from API citation
		//error_log_add ("title_get(): using raw title: " + raw_title);  // not really an error
	}

	pattern = errata_text;  // look for an errata string; as of 2021-10-01, errata string not available in API citation
	Match match = Regex.Match (title, pattern);
	if (match.Success)
		errata = " |errata=" + match.Groups[1].Value.Trim();

	pattern = amended_text;  // look for an amended string; as of 2021-10-01, amended string not available in API citation
	match = Regex.Match (title, pattern);
	if (match.Success)
		amends = " |amends=" + match.Groups[1].Value.Trim();

	return " |title=" + title + errata + amends;
}

//---------------------------< H I D E >----------------------------------------------------------------------
//
// HIDE TEMPLATES: find templates that are not <dont_hide>; replace the opening {{ with __0P3N__, the closing }}
// with __CL0S3__, and internal | (pipes) with __P1P3__
//
// single curly braces in urls and other parameter values can confuse other regex in this code so replace {
// with __0CU!21Y__ and } with __CCU!21Y__
//
private string hide (string ArticleText, string dont_hide)
{
	string pattern = @"\{\{(?!\s*" + dont_hide + @")[^\{\}]*\}\}";

	if (Regex.Match (ArticleText, pattern).Success)
	{
		ArticleText = Regex.Replace(ArticleText, pattern,
			delegate(Match match)
			{
				string fixed_template;  // a hidden template is assembled here
				string raw_template = match.Groups[0].Value;  // the whole
				// template
				pattern = @"\{\{";  // hide the opening {{
				fixed_template = Regex.Replace (raw_template, pattern, "__0P3N__");
				pattern = @"\}\}";  // hide the closing }}
				fixed_template = Regex.Replace (fixed_template, pattern, "__CL0S3__");
				pattern = @"\|";  // and hide the pipes
				fixed_template = Regex.Replace (fixed_template, pattern, "__P1P3__");
				return fixed_template;
			});
	}

	pattern = @"(\<!\-{2,}\s*[^\>\|\}]*)\{\{(\s*" + dont_hide + @"[^\}]*)\}\}([^\>]*\-{2,}\>)";  // <!-- {{citx...}} -->
	ArticleText = Regex.Replace(ArticleText, pattern, "$1__0P3N__$2__CL0S3__$3");

	pattern = @"\{\|";  // open table markup
	ArticleText = Regex.Replace(ArticleText, pattern, "__0T4BL3__");

	pattern = @"\|\}(?!\})";  // close table markup
	ArticleText = Regex.Replace(ArticleText, pattern, "__CT4BL3__");

	pattern = @"([^\{])\{([^\{])";  // single opening curly brace
	ArticleText = Regex.Replace(ArticleText, pattern, "$1__0CU!21Y__$2");

	pattern = @"([^\}])\}([^\}])";  // single closing curly brace
	ArticleText = Regex.Replace(ArticleText, pattern, "$1__CCU!21Y__$2");

	pattern = @"\[\[(?![Ff]ile|[Ii]mage)([^\|\]]+)\|([^\]]+)\]\]";  // HIDE complex wikilinks: [[article title|label]] to __WL1NK_O__article title__P1P3__label__WL1NK_C__
	ArticleText = Regex.Replace(ArticleText, pattern, "__WL1NK_O__$1__P1P3__$2__WL1NK_C__");  // [[File: with wikilinks inside can be confusing

	pattern = @"\[\[([^\]]+)\]\]";  // HIDE simple wikilinks: [[article title]] to __WL1NK_O__article title__WL1NK_C__
	ArticleText = Regex.Replace(ArticleText, pattern, "__WL1NK_O__$1__WL1NK_C__");

	return ArticleText;
}

//---------------------------< U N H I D E >------------------------------------------------------------------
//
// UNHIDE TEMPLATES: find templates and wikilinks that are hidden; replace the 'hide' keywords with the
// appropriate wiki markup
//
private string unhide (string ArticleText)
{
	ArticleText = Regex.Replace(ArticleText, @"__WL1NK_O__", "[[");  // UNHIDE: replace __WL1NK_O__ with [[
	ArticleText = Regex.Replace(ArticleText, @"__WL1NK_C__", "]]");  // UNHIDE: replace
	// __WL1NK_C__ with ]]
	ArticleText = Regex.Replace(ArticleText, @"__P1P3__", "|");  // UNHIDE: replace __P1P3__ with |
	ArticleText = Regex.Replace(ArticleText, @"__0T4BL3__", "{|");  // UNHIDE: replace __0T4BL3__ with {|
	ArticleText = Regex.Replace(ArticleText, @"__CT4BL3__", "|}");  // UNHIDE: replace __CT4BL3__ with |}
	ArticleText = Regex.Replace(ArticleText, @"__0CU!21Y__", "{");  // UNHIDE: replace __0CU!21Y__ with {
	ArticleText = Regex.Replace(ArticleText, @"__CCU!21Y__", "}");  // UNHIDE: replace __CCU!21Y__ with }
	ArticleText = Regex.Replace(ArticleText, @"__0P3N__", "{{");  // UNHIDE: replace __0P3N__ with {{
	ArticleText = Regex.Replace(ArticleText, @"__CL0S3__", "}}");  // UNHIDE: replace __CL0S3__ with }}
	return ArticleText;
}

//---------------------------< S U M M A R Y _ C O N C A T >--------------------------------------------------
//
// concatenates text onto an existing edit summary string, limiting the string to a length of no more than 347
// characters.  When <summary> appended with <text> would be longer than the allowed 347 character limit, this
// function replaces <text> with an ellipsis.
// Once an ellipsis is added, no more <text> can be added to <summary>
//
private string summary_concat (string summary, string text)
{
	if (0 <= summary.IndexOf ("..."))  // if ellipsis already present in <summary>, abandon
		return summary;

	if (347 >= (summary.Length + text.Length + 3))  // if adding <text> to summary will not overrun the 347 char limit (+ 3 to make sure we can add ellipsis if necessary)
		return summary + text;  // append <text> to <summary> and done

	return summary + "...";  // append ellipsis instead
}

//---------------------------< C O D E _ N O W I K I >--------------------------------------------------------
//
// wraps 'text' in <code><nowiki>text</nowiki></code> tags for error log
//
private string code_nowiki (string text)
{
	return "<code><nowiki>" + text + "</nowiki></code>";
}

//---------------------------< E R R O R _ L O G _ A D D >----------------------------------------------------
//
// adds an error message to the error log list.  Probably superfluous.
//
private void error_log_add (string message)
{
	error_log_list.Add (message);
}

//---------------------------< L O G _ E R R O R S >----------------------------------------------------------
//
// writes the content of the error log list to the log file, prettified with wiki markup.
//
private void log_errors (string article_title, List<string> error_log_list)
{
	System.IO.StreamWriter sw;
	string time = DateTimeOffset.Now.ToString("u").Substring (11, 9);
	string date = DateTimeOffset.Now.ToString("u").Substring (0, 10);
	string log_file = @"Z:\Wikipedia\AWB\Monkbot_tasks\Monkbot_task_19_cite_iucn_update\logs\" + date + ".txt";
	int seconds = DateTimeOffset.Now.Second;
	int minutes = DateTimeOffset.Now.Minute;
	int hours = DateTimeOffset.Now.Hour;

	sw = System.IO.File.AppendText (log_file);
	sw.WriteLine ("*[[" + article_title + "]] (" + time + "):");

	foreach (string list_item in error_log_list)
		sw.WriteLine ("*:" + list_item);

	error_log_list.Clear();
	sw.Close();
}

//---------------------------< C O U N T E D _ R E P L A C E >------------------------------------------------
//
// common function to replace <pattern> with <replace> and bump <count> until no more <pattern>
//
private string counted_replace (string template, string pattern, string replace, ref int count)
{
	Regex rgx = new Regex (pattern);  // make a new regex from <pattern>

	while (Regex.Match (template, pattern).Success)  // look for <pattern> in <template>
	{
		template = rgx.Replace (template, replace, 1);  // replace one copy of <pattern> with <replace>
		count++;  // bump the counter
	}
	return template;
}

//===========================<< S T A T I C   D A T A >>======================================================

static bool status_added = false;  // set to true when |status= created in taxobox
static int plain_text_modified_count = 0;  // number of plain-text citations that were modified from the iucn api
static int plain_text_count = 0;  // total number of plain-text iucn references
static int api_call_count = 0;  // number of api calls made; this value not reported in edit summary
static int api_fetch_fail_count = 0;  // number of api fetches that failed
static int api_no_cite_return_count = 0;  // number of times that the api returned a non-citation value like: {"value":"0","species":"202965"}
static int parse_fail_count = 0;  // number of times that we couldn't parse the api return
static int page_doi_skip_count = 0;  // number of templates or plain-text references skipped because page and doi assessment ID mismatch (could be errata but since no errata date ...)
static int api_no_species_return_name_count = 0;  // number of times that the api returned a non-species value (species name)
static int api_no_species_return_id_count = 0;  // number of times that the api returned a non-species value (species id for {{IUCN status}})
static int iucn_status_updated_count = 0;  // number of times that we updated the iucn status in taxobox-like templates
static int iucn_status_confirmed_count = 0;  // number of times that we confirmed the iucn status in taxobox-like templates
static int iucn_status_system_updated_count = 0;  // number of times that we updated the iucn status system in taxobox-like templates
static string taxobox_blank = null;  // gets blank taxobox as flag
static bool status_ref_added = false;  // set to true when |status_ref= created
static bool status_system_added = false;  // set to true when |status_system= created
static bool status_ref_updated = false;  // set to true when |status_ref= updated
static bool status_ref_current = false;  // set to true when |status_ref= less than 6 months old
static int duplicates_removed_count = 0;  // number of duplicate status references removed

static string sc_ref_tag_begin = @"\<[Rr][Ee][Ff]\s*name\s*=\s*""?";  // these for taxobox |status_ref= handling
static string sc_ref_tag_end = @"""?\s*/\>";
static string ref_def_begin = @"\<[Rr][Ee][Ff]\s*name\s*=\s*""?";  // these for taxobox |status_ref= <ref name=... /> handling to locate the matching definition
static string ref_def_end = @"""?\s*\>[^\<]*\</[Rr][Ee][Ff]\>";
static string reflist_cleanup = @"(\{\{\s*[Rr]eflist[^\}]*\|\s*refs\s*=[^\}]*)\<\s*[Rr][Ee][Ff][^\>]*/\>";
static string hide_non_ref_tag_pattern = @"\<((?!/[Rr][Ee][Ff]|[Rr][Ee][Ff])[^\>]*)\>";
static string angle_open = "__4ng13_0__";
static string angle_close = "__4ng13_C__";
static string hide_non_ref_replace_val = angle_open + "$1" + angle_close;

static int iucn_template_count = 0;  // total number of cite IUCN templates
static int other_template_count = 0;  // total number of cite journal/web templates

//---------------------------< A P I >------------------------------------------------------------------------

static string api_species_url = "http://apiv3.iucnredlist.org/api/v3/species/";  // for fetching species data from the api by name
static string api_species_id_url = api_species_url + "id/";  // for fetching species data from the api by taxon id (for {{IUCN status}})
static string api_id_url = api_species_url + "citation/id/";  // for fetching citation data from the api using taxon id
static string api_name_url = api_species_url + "citation/";  // for fetching citation data from the api using binomial
static string iucn_api_token_file = @"Z:\Wikipedia\AWB\Monkbot_tasks\Monkbot_task_19_cite_iucn_update\iucn_api_token";  // token required to be private; stored locally here
static string api_token = null;  // stored at iucn_api_token_file

//---------------------------< C I T E   I U C N >------------------------------------------------------------

static string IS_CITE_IUCN = @"(?:[Cc]ite iucn|[Cc]ite IUCN)";
static string iucn_template_pattern = @"\{\{\s*" + IS_CITE_IUCN + @"[^\}]+\}\}";  // basic cite IUCN template pattern
static string iucn_title = @"\|\s*title\s*=([^\|\}]*)";  // everything in cite IUCN |title= for api calls

static string[] url_patterns = new string[]
{
	@"https?://www\.iucnredlist\.org/details/(\d+)/\b(?:all|full)",
	@"https?://www\.iucnredlist\.org/details/full/(\d+)/\d+",
	@"https?://www\.iucnredlist\.org/details/(\d+)/\d+",
	@"https?://www\.iucnredlist\.org/details/(\d+)/?",
	@"https?://www\.iucnredlist\.org/details/summary/(\d+)",
	@"https?://www\.iucnredlist\.org/search/details\.php/(\d+)/(?:all|summ)",
	@"https?://oldredlist\.iucnredlist\.org/details/(\d+)/\d+",
};

static string ref_param_empty = @"\|\s*ref\s*=\s*([\|\}])";
static string ref_param_not_empty = @"\|\s*ref\s*=\s*([^\|\}]+)";

//---------------------------< C I T E   J O U R N A L / W E B >----------------------------------------------

static string IS_CITE_OTHER = @"(?:[Cc]ite journal|[Cc]ite web)";  // TODO: expand this to include more redirects?
static string other_template_pattern = @"\{\{\s*" + IS_CITE_OTHER + @"[^\}]+\}\}";  // basic cite journal / cite web template pattern

//---------------------------< N E W   C I T E   I U C N >----------------------------------------------------
//
// parse_pattern doesn't work for citations like this (from [[Cantleya]]) because of the 'extra' year ahead of
// the binomial:
//   Asian Regional Workshop (Conservation & Sustainable Management of Trees, Viet Nam, August 1996) 1998. Cantleya corniculata.
//   The IUCN Red List of Threatened Species 1998: e.T33197A9760751. https://dx.doi.org/10.2305/IUCN.UK.1998.RLTS.T33197A9760751.en .Downloaded on 1 October 2021
//
// Haven't seen enough of these to attempt a second parse pattern
//
//static string citation_from_api_pattern = @"\[\{""citation"":""([^""]*)""\}\]";
static string citation_from_api_pattern = @"\[\{""citation"":""([^\}]*)""\}\]";
static string parse_pattern = @"(^\D+)(\d{4})\.(\D+)\. The IUCN Red List of Threatened Species (\d{4}): (e\.T\d+A(\d+))\.\D+(10\.2305\/IUCN\.UK\.[\d\-]+\.RLTS\.T\d+A(\d+)\S+)\D+(\d{1,2} [A-Za-z]+ \d{4})";

static string[][] search_and_replaces =
{
	new string[] {@"(.+?)\sssp\.\s+(.+?)\s(\([^\)]+\))$", @"''$1'' ssp. ''$2'' $3"},  // binomen ssp. subspecies (zoology) with errata or amended text
	new string[] {@"(.+?)\sssp\.\s+(.+)", @"''$1'' ssp. ''$2''"},  // binomen ssp. subspecies (zoology)
	new string[] {@"(.+?)\ssubsp\.\s+(.+?)\s(\([^\)]+\))$", @"''$1'' subsp. ''$2'' $3"},  // binomen subsp. subspecies (botany) with errata or amended text
	new string[] {@"(.+?)\ssubsp\.\s+(.+)", @"''$1'' subsp. ''$2''"},  // binomen subsp. subspecies (botany)
	new string[] {@"(.+?)\svar\.\s+(.+?)\s+(\([^\)]+\))$", @"''$1'' var. ''$2'' $3"},  // binomen var. variety (botany) with errata or amended text
	new string[] {@"(.+?)\svar\.\s+(.+)", @"''$1'' var. ''$2''"},  // binomen var. variety (botany)
	new string[] {@"(.+?)\ssubvar\.\s+(.+?)\s(\([^\)]+\))$", @"''$1'' subvar. ''$2'' $3"},  // binomen subvar. subvariety (botany) with errata or amended text
	new string[] {@"(.+?)\ssubvar\.\s+(.+)", @"''$1'' subvar. ''$2''"},  // binomen subvar.
	// subvariety (botany)
	new string[] {@"(.+?)\s*(\([^\)]+\))$", @"''$1'' $2"}  // binomen with errata or amended text
};

static string errata_text = @"\(errata version published in (\d{4})\)";
static string amended_text = @"\(amended version of (\d{4}) assessment\)";

//---------------------------< T A X O B O X >----------------------------------------------------------------

static string HIDE_ALL_BUT_TAXOBOX = @"(?:[Tt]axobox\s*\||[Ss]peciesbox\s*\|)";  // this to prevent confusion with {{Taxobox authority}} when hiding
static string IS_TAXOBOX = @"(?:[Tt]axobox|[Ss]peciesbox)";  // for hiding all non-taxobox-like templates
static string taxobox_template_pattern = @"(\{\{\s*(" + IS_TAXOBOX + @"))[^\}]+(\}\})";  // basic taxobox-like template pattern; TODO: {{subspeciesbox}}?
static string taxobox_blank_pattern = @"\{\{\s*" + IS_TAXOBOX + @"\}\}";
static string taxobox_new_stat_sys_ref_pattern = @"(\{\{\s*" + IS_TAXOBOX + @"[^\}]+?)(\s*)(\}\})";  // used to create new |status=, |status_system=, and |status_ref= params in taxobox
static string taxobox_status_ref_pattern = @"(\|\s*status_ref\s*=\s*)(\<ref[^\>]*\>)[^\<]*(\</ref\>)";  // used to replace |status_ref= param in taxobox
static string taxobox_status_ref_empty_pattern = @"(\|\s*status_ref\s*=[ \t]*)([\r\n]*[\|\}])";  // used to add reference to |status_ref= param in taxobox
static string taxobox_status_sc_ref_pattern = @"(\|\s*status_ref\s*=\s*)(\<[Rr][Ee][Ff][^\>]+/\>)";  // used to replace |status_ref= param in taxobox
static string taxobox_status_ref = null;  // the 'new' value for |status_ref=
static string taxobox_status_ref_open_tag = null;  // its matching ref open tag
static string taxobox_status_ref_sc_tag = null;  // and its matching self-closed tag

static string stray_dot = @"(\|\s*status_ref\s*=\s*)\.";  // delete stray dot; because I found one such (Astroblepus pholeter)
static string stray_splat = @"(\|\s*status_ref\s*=\s*)\*";  // delete stray splat; because I found one such (Gray short-tailed bat)
static string stray_equal = @"(\|\s*status_ref\s*=\s*)=";
// delete stray equal; because I found one such (Cyprinus hieni)
static string stray_nbsp = @"(\|\s*status_ref\s*=\s*)&nbsp;";  // delete stray &nbsp; because I found one such (Euconocephalus remotus)
static string html_comment = @"(\|\s*status_ref\s*=[^\|\}]*)\<!\-\-[^\>]*\-\-\>";  // and html comments
static string unrecognized_species_name = null;  // gets taxobox species name that IUCN doesn't recognize

//---------------------------< T A X O B O X _ S T A T U S >--------------------------------------------------

static string IS_IUCN_STATUS = @"(\b(?:LC|LR/lc|NT|LR/nt|LR/cd|VU|EN|CR|PE|PEW|EW|EX|DD|NE)\b)";  // also used with {{IUCN status}}
static string taxobox_status_missing = @"(\{\{\s*" + IS_TAXOBOX + @"[^\}]*)\|\s*status\s*=";
static string taxobox_status_empty = @"(\{\{\s*" + IS_TAXOBOX + @"[^\}]*)\|\s*status\s*=\s*([\|\}])";
static string taxobox_status_value = @"(\{\{\s*" + IS_TAXOBOX + @"[^\}]*)\|\s*status\s*=\s*([^\|\}]+)";
static string taxobox_status_pattern = @"(\|\s*status\s*=\s*)[^\|\}]*?(\s*[\|\}])";
static string status_from_api_pattern = @"""category"":""([^""]+)""";  // for |status=

//---------------------------< T A X O B O X _ S Y S T E M >--------------------------------------------------

static string IS_IUCN_SYSTEM = @"(\b(?:IUCN2.3|IUCN3.1)\b)";
static string taxobox_system_missing = @"(\{\{\s*" + IS_TAXOBOX + @"[^\}]*)\|\s*status_system\s*=";
static string taxobox_system_empty = @"(\{\{\s*" + IS_TAXOBOX + @"[^\}]*)\|\s*status_system\s*=\s*([\|\}])";
static string taxobox_system_value = @"(\{\{\s*" + IS_TAXOBOX + @"[^\}]*)\|\s*status_system\s*=\s*([^\|\}]+)";
static string taxobox_system_pattern = @"(\|\s*status_system\s*=\s*)[^\|\}]*([^\|\}])";
static string status_system_from_api_pattern = @"""assessment_date"":""(\d+)";  // for |status_system=

//---------------------------< T A X O B O X _ S T A T U S _ R E F >------------------------------------------

static string taxobox_status_ref_missing = @"(\{\{\s*" + IS_TAXOBOX + @"[^\}]*)\|\s*status_ref\s*=";
static string taxobox_status_ref_empty = @"(\{\{\s*" + IS_TAXOBOX + @"[^\}]*)\|\s*status_ref\s*=\s*([\|\}])";
static string taxobox_status_ref_value = @"(\{\{\s*" + IS_TAXOBOX + @"[^\}]*)\|\s*status_ref\s*=\s*([^\|\}]+)";
static string ref_tag_named_pattern = @"(\<[Rr][Ee][Ff][^\>]*name\s*=\s*""?([^""\>]*)""?\s*\>)";
static string ref_tag_named_sc_pattern = @"(\<[Rr][Ee][Ff][^\>]*name\s*=\s*""?([^""/]*)""?\s*/\s*\>)";
static string ref_tag_unnamed_pattern = @"(\<[Rr][Ee][Ff]\>)";

//---------------------------< T A X O B O X _ S P E C I E S _ N A M E >--------------------------------------

static string binomial_pattern = @"\|\s*binomial\s*=\s*([^\|\}]*)";  // taxobox
static string taxon_pattern = @"\|\s*taxon\s*=\s*([^\|\}]*)";  // speciesbox
static string genus_pattern = @"\|\s*genus\s*=\s*([^\|\}]*)";  // these two combined to make binomial name
static string species_pattern = @"\|\s*species\s*=\s*([^\|\}]*)";
static string name_pattern = @"\|\s*name\s*=\s*([^\|\}]*)";  // taxobox and speciesbox

//---------------------------< D A T E S >--------------------------------------------------------------------

static Dictionary<string, string> date_patterns = new Dictionary<string, string>()
{
	{"dmy", @"\d{1,2}\s+([JFMASOND][a-z]+)\s+(\d{4})"},  // dmy
	{"mdy", @"([JFMASOND][a-z]+)\s+\d{1,2}\s*,\s+(\d{4})"},  // mdy
	{"ymd", @"(\d{4})\-(\d{2})\-\d{2}"}  // ymd
};

static string preferred_status_ref_tag_name = @"iucn status (\d{1,2}\s+([JFMASOND][a-z]+)\s+(\d{4}))";
static string access_date = @"\|access\-?date=([^\|\}]+)";

static Dictionary<string, int> months = new Dictionary<string, int>()
{
	{"january", 1},  // these for dmy and mdy
	{"february", 2},
	{"march", 3},
	{"april", 4},
	{"may", 5},
	{"june", 6},
	{"july", 7},
	{"august", 8},
	{"september", 9},
	{"october", 10},
	{"november", 11},
	{"december", 12},
	{"jan", 1},  // these for dmy and mdy
	{"feb", 2},
	{"mar", 3},
	{"apr", 4},
	//{"may", 5},  // same as whole month name; can't have two with the same key
	{"jun", 6},
	{"jul", 7},
	{"aug", 8},
	{"sep", 9},
	{"oct", 10},
	{"nov", 11},
	{"dec", 12},
	{"01", 1},  // these
for ymd{"02", 2},{"03", 3},{"04", 4},{"05", 5},{"06", 6},{"07", 7},{"08", 8},{"09", 9},{"10", 10},{"11", 11},{"12", 12},};//--------------------------- R E M O V E   D U P L I C A T E   S T A T U S   R E F >-------------------------static string[]symbols = new string[]{@"\{",@"\(",@"\|",@"\.",@"\-",@"\)",@"\}",};static stringref_open_tag_unnamed = @"\<[Rr][Ee][Ff]\>";static stringref_open_tag_named = @"\<[Rr][Ee][Ff][^\>]*\>";static stringref_close_tag = @"\</[Rr][Ee][Ff]>";static stringbib_open_ul = @"[\r\n]+\*\s*";static stringbib_close_ul = @"([\r\n]+)";//---------------------------< S P E C I E S _ N A M E _ C L E A N U P >--------------------------------------//// these things must be removed from binomial before calling the api with the binomial//static string[][] cleanup_patterns ={new string[] {ref_open_tag_named + @"[^\<]*" + ref_close_tag,""},// references; [[Lampadioteuthis]] caused api fetch exceptionnew string[] {@"\<[Rr][Ee][Ff][^\>]+/\>",""},// self-closed references; [[Sand cat]]new string[]{@"\<!\-\-[^\>]*\-\-\>",""},// html commentnew string[] {@"[\.;:]+$",""},// trailing punctuationnew string[] {"'''(.+)'''","$1"},// bold wiki markupnew string[] {"''(.+)''$","$1"},// italic wiki markupnew string[] {@"""",""},// double quote marksnew string[] {"†",""},// extinction markersnew string[] {@"\[\[",""},// opening wikilink markupnew string[] {@"\]\]",""},// closing wikilink markupnew string[] {@"\s*\([^\)]+\)",""},// disambiguationnew string[] {@"[\.;:]+$",""},// trailing punctuation (again)new string[] {@"\<nowiki/\>",""},// self-closed <nowiki/> tagnew string[] {@"\<nowiki\>",""},// opening <nowiki> tagnew string[] {@"\</nowiki\>",""},// closing </nowiki> tag};//----------------------------------------< P L A I N _ T E X T >---------------------------------------------//// for plaintext references wrapped in <ref>...</ref> tags or in unordered markup (bibliography); must have a// recognizable page identifier or doi or a url from which a taxon id can 
be extracted//static stringplain_text_ref_pattern = @"(\< *ref[^\>]*\>)([^\<]*)(\</ref>)";// <ref>anything</ref> ref tags and reference are capturedstatic stringplain_text_bib_pattern = @"([\r\n]+\*)([^\r\n]*iucnredlist\.org[^\r\n]*)([\r\n]+)"; // some sort of iucn ref in unordered liststatic stringplain_text_page_taxon_id = @"\be\.T(\d+)A\d+";// get taxon id from pagestatic stringplain_text_doi_taxon_id = @"\bRLTS\.T(\d+)A\d+";// get taxon id from doistatic stringplain_text_taxon_id_url = @"https?://(?:www|oldredlist)\.iucnredlist\.org/\S+?/(\d+)\S+";// get taxon id from url//---------------------------< I U C N   S T A T U S >--------------------------------------------------------static stringiucn_status_template_pattern = @"(\{\{\s*IUCN status[^\}]+\})";static stringiucn_status_lead = @"(\{\{\s*IUCN status\s*\|\s*)";static stringiucn_status_status = iucn_status_lead + IS_IUCN_STATUS;static stringiucn_status_id = @"(\{\{\s*IUCN status\s*\|[^\|]+\|\s*)(\d+)";// Monkbot_task_19_cite_iucn_update.cs