User:Chlod/Tools/JTWC Archiver

JTWC Archiver
Original author(s): Chlod Alejandro
Developer(s): Contributors to the JTWC Archiver
Initial release: September 28, 2020
Stable release: 1.1.0 (October 6, 2020)
Repository: https://github.com/ChlodAlejandro/jtwc-archiver
Written in: JavaScript
Engine: Node.js
Type: Web scraper
License:
Website: https://wiki.chlod.net/jtwc

The JTWC Archiver is a Node.js script that parses the Joint Typhoon Warning Center's RSS feed and archives whichever version of each product is available at the time. This way, JTWC warnings that are not immediately added to an article can still be used later. The script runs once every 10 minutes and archives each new product. Warnings older than 6 months are deleted.
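
As a rough illustration of the parsing step, the sketch below pulls product links out of a feed's raw text and decides what to archive on a given run. It is not the archiver's actual code: the function names `extractProductUrls` and `productsToArchive` are hypothetical, and the only detail taken from this page is the shape of a product URL (`https://www.metoc.navy.mil/jtwc/products/...`). It also assumes that an unchanged feed means there is nothing new to archive.

```javascript
// Hypothetical sketch; not taken from the JTWC Archiver source.
// Matches product links such as
// https://www.metoc.navy.mil/jtwc/products/wp1520web.txt
const PRODUCT_URL =
  /https:\/\/www\.metoc\.navy\.mil\/jtwc\/products\/[A-Za-z0-9_.-]+\.(?:txt|gif|tcw)/g;

// Returns the unique product URLs found in the raw feed text,
// in order of first appearance.
function extractProductUrls(feedText) {
  const matches = feedText.match(PRODUCT_URL) || [];
  return [...new Set(matches)];
}

// Assumption: if the feed text is identical to the previously saved copy,
// nothing was updated, so nothing needs archiving. Otherwise every product
// currently linked from the feed is a candidate for archiving.
function productsToArchive(previousFeedText, currentFeedText) {
  if (previousFeedText === currentFeedText) {
    return [];
  }
  return extractProductUrls(currentFeedText);
}

// Example usage with a made-up feed fragment.
const oldFeed =
  "<link>https://www.metoc.navy.mil/jtwc/products/wp1520web.txt</link>";
const newFeed =
  oldFeed + "<a href='https://www.metoc.navy.mil/jtwc/products/wp1520.gif'>graphic</a>";
console.log(productsToArchive(oldFeed, newFeed));
```

A scheduler such as cron could run a script like this every 10 minutes; how the real archiver is scheduled is not described here.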

Because these files are archived straight from the source, they are effectively identical to the actual JTWC warnings, so you can use them as a basis for citation. Do note that warnings are deleted after 6 months, which means you'll have to save the archive link on the Wayback Machine or some other web archive. The JTWC Archiver only serves as a temporary location for warnings, not a permanent one.

Note: As of October 12, I'm deciding whether to hold the archives forever or not. If the size doesn't exceed 100 MB after a year, I'll probably just hold the warnings forever.
Update (27 April 2024): The result was keep forever.

Usage

You can browse the Tropical Cyclone Formation Alert and Tropical Cyclone Warning texts and images at https://wiki.chlod.net/jtwc.

To cite a JTWC warning, use the following:

Tropical Storm 15W (Kujira) Warning No. 10 (Report). United States Joint Typhoon Warning Center. 29 September 2020. Archived from the original on 29 September 2020. Retrieved 29 September 2020.
{{Cite report|last=|first=|date={{subst:date}}|title=Tropical Storm 15W (Kujira) Warning No. 10|url=https://www.metoc.navy.mil/jtwc/products/wp1520web.txt|url-status=dead|archive-url=https://wiki.chlod.net/jtwc/text/2020-09-29-0200-wp1520web.txt|archive-date={{subst:date}}|access-date={{subst:date}}|publisher=United States Joint Typhoon Warning Center}}

You may choose to drop the publisher entirely, like I do.

"Tropical Storm 15W (Kujira) Warning No. 10". United States Joint Typhoon Warning Center. 29 September 2020. Archived from the original on 29 September 2020. Retrieved 29 September 2020.
{{Cite web|last=|first=|date={{subst:date}}|title=Tropical Storm 15W (Kujira) Warning No. 10|url=https://www.metoc.navy.mil/jtwc/products/wp1520web.txt|url-status=dead|archive-url=https://wiki.chlod.net/jtwc/text/2020-09-29-0200-wp1520web.txt|archive-date={{subst:date}}|access-date={{subst:date}}|publisher=United States Joint Typhoon Warning Center}}
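
If you cite many warnings, the wikitext can be generated instead of typed by hand. The following is a minimal sketch, assuming the warning text lives in the text folder of the archive and that the archive file name is the timestamp, a hyphen, and the original product file name, as in the {{Cite web}} example above. The `buildCitation` function and its parameter names are hypothetical, and the empty `last` and `first` parameters are left out.

```javascript
// Hypothetical helper; not part of the JTWC Archiver.
// title:        citation title, e.g. "Tropical Storm 15W (Kujira) Warning No. 10"
// product:      product file name on the JTWC site, e.g. "wp1520web.txt"
// archiveStamp: timestamp prefix of the archived copy, e.g. "2020-09-29-0200"
// urlStatus:    "dead" or "unfit" (see the url-status section)
function buildCitation({ title, product, archiveStamp, urlStatus = "dead" }) {
  if (urlStatus !== "dead" && urlStatus !== "unfit") {
    throw new Error("url-status should be either dead or unfit");
  }
  const params = [
    ["date", "{{subst:date}}"],
    ["title", title],
    ["url", `https://www.metoc.navy.mil/jtwc/products/${product}`],
    ["url-status", urlStatus],
    ["archive-url", `https://wiki.chlod.net/jtwc/text/${archiveStamp}-${product}`],
    ["archive-date", "{{subst:date}}"],
    ["access-date", "{{subst:date}}"],
    ["publisher", "United States Joint Typhoon Warning Center"],
  ];
  const body = params.map(([name, value]) => `${name}=${value}`).join("|");
  return `{{Cite web|${body}}}`;
}

// Example usage: prints the wikitext for the Kujira warning cited above.
console.log(
  buildCitation({
    title: "Tropical Storm 15W (Kujira) Warning No. 10",
    product: "wp1520web.txt",
    archiveStamp: "2020-09-29-0200",
  })
);
```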

At the end of each year, most of the links are replaced with more resilient backups. Details are in the Periodic archiving section below.

Output

The tree below shows the output of the JTWC Archiver for a hypothetical system, wp4220, archived on September 28, 2020 at 00:00 UTC.

  • Working directory
    • jtwc.rss — The latest copy of the JTWC RSS. This is used to check if there were any updates to the JTWC bulletin.
    • jtwc_products — The folder containing all archived JTWC products.
      • gif — The folder containing graphics for TCFAs and TCWs.
        • 2020-09-28-0000-wp4220.gif — The TCFA/TCW graphic for wp4220 exactly at the moment of archiving.
        • latest-wp4220.gif — The latest TCFA/TCW graphic for wp4220. This file is overwritten when a new graphic is issued.
      • jmv — The folder containing JMV 3.0 data.
        • 2020-09-28-0000-wp4220.tcw — JMV 3.0 data for wp4220 exactly at the moment of archiving.
        • latest-wp4220.tcw — The latest JMV 3.0 data for wp4220. This file is overwritten when new data is available.
      • prog — The folder containing tropical cyclone prognostic reasonings.
        • 2020-09-28-0000-wp4220prog.txt — The prognostic reasoning for wp4220 exactly at the moment of archiving.
        • latest-wp4220prog.txt — The latest prognostic reasoning for wp4220. This file is overwritten when a new version is issued.
      • text — The folder containing the TCFA and TCW warnings.
        • 2020-09-28-0000-abioweb.txt — The advisories for the ABIO sector (the Indian Ocean). Since this text file is always provided when a system in that sector is active, it will be archived as well.
        • 2020-09-28-0000-abpwweb.txt — The advisories for the ABPW sector (the Pacific Ocean). Since this text file is always provided when a system in that sector is active, it will be archived as well.
        • 2020-09-28-0000-wp4220web.txt — The TCFA/TCW text for wp4220 exactly at the moment of archiving.
        • latest-abioweb.txt — The latest ABIO advisory. This file is overwritten when a new advisory is issued.
        • latest-abpwweb.txt — The latest ABPW advisory. This file is overwritten when a new advisory is issued.
        • latest-wp4220web.txt — The latest TCFA/TCW text for wp4220. This file is overwritten when a new warning is issued.
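
The naming pattern in the tree above can be sketched as a small function. This is an illustration only, not the archiver's actual code: `utcStamp` and `archivePaths` are hypothetical names, and the sketch assumes the timestamp prefix is the archiving time in UTC, formatted as YYYY-MM-DD-HHmm.

```javascript
const path = require("path");

// Formats a Date as YYYY-MM-DD-HHmm in UTC, e.g. 2020-09-28-0000.
function utcStamp(date) {
  const pad = (n) => String(n).padStart(2, "0");
  return [
    date.getUTCFullYear(),
    pad(date.getUTCMonth() + 1), // getUTCMonth() is zero-based
    pad(date.getUTCDate()),
    pad(date.getUTCHours()) + pad(date.getUTCMinutes()),
  ].join("-");
}

// Returns the dated and "latest" paths for one product, relative to the
// working directory. folder is one of "gif", "jmv", "prog" or "text".
function archivePaths(folder, fileName, archivedAt) {
  const base = path.posix.join("jtwc_products", folder);
  return {
    dated: path.posix.join(base, `${utcStamp(archivedAt)}-${fileName}`),
    latest: path.posix.join(base, `latest-${fileName}`),
  };
}

// Example usage: the graphic for wp4220 archived on September 28, 2020 at 00:00 UTC.
const paths = archivePaths("gif", "wp4220.gif", new Date(Date.UTC(2020, 8, 28, 0, 0)));
console.log(paths.dated);  // jtwc_products/gif/2020-09-28-0000-wp4220.gif
console.log(paths.latest); // jtwc_products/gif/latest-wp4220.gif
```

Writing the dated copy first and then overwriting the "latest" copy keeps every archived version while giving each product one stable file name.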

url-status

The url-status parameter in {{Cite web}} should always be either dead (which emphasizes the archived version over the original) or unfit (which hides the original entirely). This is because the links that lead to a specific warning are time-sensitive and will change, so the archived version is preferred over the live version (which may already point to an entirely new system).

Periodic archiving

At the end of each year, all collected products will be uploaded to the Internet Archive for permanent storage. There are three main reasons for this:

  1. The Internet Archive is a more widely known (and thus more widely trusted) website, which helps alleviate concerns about SELFPUB or privacy (even though the archiver website is just an Apache directory browser).
  2. Though a year of operation does not produce a significant total file size, there might come a time when I am forced to clear space on the server. If that happens, those files would return Not Found and would no longer be accessible.
  3. The JTWC Archiver, despite being an archive, was initially designed to be a temporary gathering place for bulletins while parts of an article have not yet been written. Bulletins change rapidly while articles may not, so there are instances where an old version is no longer available because it has been overwritten. Though I can theoretically hold all bulletins up until, say, 2030, I'd like some assurance that the bulletins will be stored "forever" by a capable and established archiving service.

Though those bulletins will be archived, it does not mean that they will be removed from the website immediately. They're put on the Internet Archive for that exact reason: archiving. It's meant to be a backup in case things go south on my end.

A list of archives is provided below.

  • 2020 (ATNIPCWP) – Archived on January 12, 2021 – TXT, GIF, PROG
    Note: The archive project started on September 28, 2020, and thus, this year only has products beginning September 28, 2020.
  • 2021 (ATNIPCWP) – Archived on January 12, 2021 – TXT, GIF, PROG, PROG
    Note: JMV 3.0 data archiving started on May 16, 2021, and thus, this year has JMV 3.0 data only for products beginning May 16, 2021.