Journal Loader
The Journal Loader is responsible for pulling data from PubMed Central (PMC) and the MEDLINE database, and making the appropriate updates to PASS.
Journal Loader Summary
The Journal Loader parses the PMC type A journal .csv file and/or the MEDLINE database .txt file, and syncs with the repository by taking the following actions:
Adds journals if they do not already exist
Updates PMC method A participation if it differs from the corresponding resource in the repository.
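The two actions above can be sketched as a small in-memory routine. This is a simplified illustration, not the loader's actual code: the `Journal` record, the `Stats` counters, matching on a single ISSN, and the `Map` standing in for the repository are all assumptions made for the example. The `dryRun` flag mirrors the property of the same name: counts are collected but nothing is written.

```java
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; names and matching rule are assumptions, not the PASS implementation. */
final class JournalSyncSketch {

    /** Simplified journal record (hypothetical; the real model carries more fields). */
    record Journal(String issn, String name, boolean pmcMethodA) {}

    /** Counters reported at the end of a run. */
    static final class Stats {
        int created;
        int updated;
        int unchanged;
    }

    /**
     * Adds journals missing from the repository and updates PMC method A
     * participation where it differs. With dryRun=true the repository is left untouched.
     */
    static Stats sync(List<Journal> incoming, Map<String, Journal> repo, boolean dryRun) {
        Stats stats = new Stats();
        for (Journal journal : incoming) {
            Journal existing = repo.get(journal.issn());
            if (existing == null) {
                // Journal does not exist yet: it would be added.
                stats.created++;
                if (!dryRun) {
                    repo.put(journal.issn(), journal);
                }
            } else if (existing.pmcMethodA() != journal.pmcMethodA()) {
                // Participation differs: only that flag is changed.
                stats.updated++;
                if (!dryRun) {
                    repo.put(existing.issn(),
                            new Journal(existing.issn(), existing.name(), journal.pmcMethodA()));
                }
            } else {
                stats.unchanged++;
            }
        }
        return stats;
    }
}
```

Running the sketch once with `dryRun` set to `true` and once with `false` over the same input shows the intended behavior: the counts match in both runs, but only the second run modifies the map.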
Knowledge Needed / Skills Inventory
Development of the Journal Loader
Programming in Java
Basic understanding of the NLM journal loader
Running the Journal Loader
CLI commands
Technologies Utilized
Technical Deep Dive
Usage
Launch the Journal Loader from the command line, passing its configuration as Java system properties. Note: Replace the version number in the jar name with the specific version you are using.
Properties or Environment Variables
The following may be provided as system properties on the command line, in the form -Dprop=value.
pass.core.url
The base URL for the pass-core REST API, such as http://localhost:8080.
pass.core.user
The pass-core backend user.
pass.core.password
The pass-core backend user password.
dryRun
Do not add or update resources in the repository; only report statistics on the resources that would be added or updated.
pmc
URL of the PMC "type A" journal .csv file, for example: https://www.ncbi.nlm.nih.gov/pmc/front-page/NIH_PA_journal_list.csv
medline
URL of the MEDLINE journal file, for example: https://ftp.ncbi.nih.gov/pubmed/J_Medline.txt
LOG.*
Adjust the logging level of a particular component, e.g. LOG.org.eclipse.pass=WARN
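As a rough illustration of how settings like these can be resolved, the sketch below looks up a key first as a Java system property and then as an environment variable. The `LoaderSettings` class is hypothetical, and the mapping from `pass.core.url` to `PASS_CORE_URL` is an assumed naming convention for the example, not something this page confirms.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.Properties;

/** Hypothetical helper: resolves a setting from system properties, then environment variables. */
final class LoaderSettings {
    private final Properties props;
    private final Map<String, String> env;

    LoaderSettings(Properties props, Map<String, String> env) {
        this.props = props;
        this.env = env;
    }

    /** Uses the live JVM system properties and process environment. */
    static LoaderSettings fromSystem() {
        return new LoaderSettings(System.getProperties(), System.getenv());
    }

    /** Assumed convention: "pass.core.url" maps to the environment variable "PASS_CORE_URL". */
    static String envName(String key) {
        return key.toUpperCase(Locale.ROOT).replace('.', '_');
    }

    /** Returns the value for a key, preferring -Dkey=value over the environment; blank values count as unset. */
    Optional<String> get(String key) {
        String value = props.getProperty(key);
        if (value == null || value.isBlank()) {
            value = env.get(envName(key));
        }
        return Optional.ofNullable(value).filter(v -> !v.isBlank());
    }
}
```

With this sketch, `LoaderSettings.fromSystem().get("pmc")` would return the URL passed as `-Dpmc=...`, or an empty `Optional` if neither source defines it.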
Journal Loader Classes & Data Flow Overview
Data Flow
Initialization:
The Main class initializes the application and calls the BatchJournalFinder and LoaderEngine to start processing.
File Processing:
BatchJournalFinder processes each file using the appropriate reader (MedlineReader, NihTypeAReader). The load method in BatchJournalFinder initiates the process and collects the files to be processed.
Data Loading:
Processed journal data is passed to LoaderEngine to be loaded into the target system. If a journal is not found, a new one is created; otherwise the existing journal is updated.
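To make the File Processing step above more concrete, here is a minimal sketch of reading MEDLINE-style journal records. It assumes the file consists of `Key: value` lines (for example `JournalTitle`, `ISSN (Print)`, `NlmId`) grouped into records separated by lines of dashes; that layout is an assumption of this example and should be checked against the actual J_Medline.txt file. The class is hypothetical and is not the loader's MedlineReader.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative parser for dash-separated "Key: value" records (assumed layout). */
final class MedlineRecordSketch {

    /** Splits the text into records; each record maps field names to trimmed values. */
    static List<Map<String, String>> parse(String text) {
        List<Map<String, String>> records = new ArrayList<>();
        Map<String, String> current = new LinkedHashMap<>();
        for (String line : text.split("\\R")) {
            if (line.startsWith("---")) {
                // A separator line closes the record collected so far.
                if (!current.isEmpty()) {
                    records.add(current);
                    current = new LinkedHashMap<>();
                }
                continue;
            }
            // Split on the first colon only, so titles containing colons stay intact.
            int colon = line.indexOf(':');
            if (colon > 0) {
                current.put(line.substring(0, colon).trim(), line.substring(colon + 1).trim());
            }
        }
        // Keep a trailing record that is not followed by a separator.
        if (!current.isEmpty()) {
            records.add(current);
        }
        return records;
    }
}
```

Each resulting map could then be converted into a journal object and handed to the loading step.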
Next Step / Institution Configuration
The Journal Loader is simple to configure. It runs on any system that can run Java applications and does not require external account setup. The two data sources, PMC Type A Journals and MEDLINE, do not require accounts to access the data. Similar to the NIHMS Loader and Grant Loader, the Journal Loader is run using AWS Batch and ECS.
Related Information
The following resources are the sources of the journal data that is loaded into PASS: