Database milestones
BioDeepTime draws information from both static and dynamically updated databases. Because the constituent data are heterogeneous, BioDeepTime is compiled manually at semi-regular intervals. The resulting data products are versioned semantically and deposited separately, together with the code that was used to build each version. Static data products were chosen to enhance the reproducibility and traceability of results based on the database.
Versioning
The planned versioning framework is:
- First-level: Large structural changes and major addition of sources.
- Second-level: Corrections and additions of new data. Smaller structural changes.
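As an illustration of the two-level scheme, the sketch below parses version tags and reports which level changed between two releases. The tag format ("v1.0", first level before the dot, second level after it) is an assumption inferred from the release names in the change log, and the function names are hypothetical rather than part of BioDeepTime.

```python
import re

# Assumption: version tags look like "v1.0" or "1.0" (first-level.second-level),
# as in the release names "BioDeepTime v0.6" and "BioDeepTime v1.0".
_VERSION = re.compile(r"^v?(\d+)\.(\d+)$")


def parse_version(tag):
    """Return (first_level, second_level) as a tuple of integers."""
    match = _VERSION.match(tag.strip())
    if match is None:
        raise ValueError(f"unrecognized version tag: {tag!r}")
    return int(match.group(1)), int(match.group(2))


def change_level(old, new):
    """Classify the step between two versions under the two-level scheme."""
    old_v, new_v = parse_version(old), parse_version(new)
    if new_v <= old_v:
        raise ValueError("new version must be later than the old one")
    if new_v[0] > old_v[0]:
        return "first-level"   # large structural changes, major new sources
    return "second-level"      # corrections, new data, smaller structural changes


print(change_level("v0.6", "v1.0"))  # prints "first-level"
```

A check like this can help downstream scripts decide whether a new release may require structural adjustments (first-level) or only a data refresh (second-level).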
List of known issues
MARBEN
- Missing species and genus columns
General
- References need manual review, especially the bibtex entries. Character encoding is still a recurring problem. Some reference entities represent multiple references.
Change log
BioDeepTime v1.0 [2023-07-12]
Added
- New field samples.totalCount that represents the sample size in the case of count data, rather than the target sampling effort
- Three new abundanceUnit categories: "biomass cover", "biomass weight", "biomass volume"
- bibtex handles to the bibtex column of the refs table. The candidate bibtex entries are in the refs.bib file.
Changed
- The word occurrence was systematically replaced with record. The occurrences table was renamed to records, and its primary key from occID to recordID.
- The timeUnits table was renamed to timeOriginalUnits for better consistency. Consequently, the fields timeUnitID and timeUnit were renamed to timeOriginalUnitID and timeOriginalUnit, respectively.
- The ranks table was renamed to analyzedRanks for better consistency.
- Reference entries are forced into UTF-8 encoding
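Scripts written against earlier versions may need their table and field names updated. The sketch below applies the renames listed above to data held in memory. The layout (a dict mapping table names to lists of row dicts) and the function names are assumptions made for illustration and do not describe how BioDeepTime itself is built; the sample values are invented.

```python
# Renames introduced in v1.0, taken from the "Changed" list above.
TABLE_RENAMES = {
    "occurrences": "records",
    "timeUnits": "timeOriginalUnits",
    "ranks": "analyzedRanks",
}
FIELD_RENAMES = {
    "occID": "recordID",
    "timeUnitID": "timeOriginalUnitID",
    "timeUnit": "timeOriginalUnit",
}


def upgrade_row(row):
    """Rename pre-1.0 field names in one row; other fields pass through."""
    return {FIELD_RENAMES.get(key, key): value for key, value in row.items()}


def upgrade_tables(tables):
    """Rename pre-1.0 tables and their fields.

    Assumption: `tables` maps a table name to a list of row dicts.
    """
    return {
        TABLE_RENAMES.get(name, name): [upgrade_row(row) for row in rows]
        for name, rows in tables.items()
    }


# Hypothetical example data, for illustration only.
old = {"occurrences": [{"occID": 1, "timeUnitID": 7, "abundance": 12}]}
print(upgrade_tables(old))
```

Keeping the mapping in plain dictionaries makes it easy to extend if later versions rename further tables or fields.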
BioTime
- Omitted studies 39 and 217 due to potentially erroneous entries
- Added biomass data where there were no abundanceUnit values earlier
- Time series taxonomic/environment groups are added
- biomass values replaced abundance values in cases when count data was given as 0, but biomass was valid
Neotoma
- The sample sum count is moved to samples.totalCount from samples.samplingEffort. Accordingly, samplingEffortType is consistently set to NA.
- Neotoma references were split; multiple references per sample are now properly indicated
- Changed the taxon group to Plants
Triton
- A new version is now used, which is indicated to be released soon as Triton 2
- Fixed issues where all abundance values were relative abundances, even when count was indicated.
- The sample sum is now recorded in samples.totalCount and not in samplingEffort. In cases where samples reflect normalization to 1 gram, the value 1 is now recorded in samplingEffort with a samplingEffortType of "g".
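The following sketch illustrates the new field layout for a single sample record. It is not the code used to build the database: the record layout (a plain dict), the function name, and the normalized_per_gram flag are hypothetical, and leaving samplingEffort empty for non-normalized samples is an assumption that mirrors the NA handling described for Neotoma.

```python
def migrate_sample(sample, normalized_per_gram=False):
    """Return a copy of a pre-1.0 sample record in the v1.0 field layout.

    Assumption: before v1.0 the sample sum was stored in samplingEffort.
    """
    updated = dict(sample)  # leave the input record untouched
    # The sample sum moves from samplingEffort to totalCount.
    updated["totalCount"] = updated.pop("samplingEffort", None)
    if normalized_per_gram:
        # Samples normalized to 1 gram: effort is 1 with type "g".
        updated["samplingEffort"] = 1
        updated["samplingEffortType"] = "g"
    else:
        # Assumption: no sampling effort is known for other samples.
        updated["samplingEffort"] = None
        updated["samplingEffortType"] = None
    return updated


# Hypothetical record, for illustration only.
old = {"sampleID": "s1", "samplingEffort": 312}
print(migrate_sample(old, normalized_per_gram=True))
```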
SedTraps
- added missing samplingEffortType (all are m^2)
MARBEN
- added samplingEffort and samplingEffortType values from processing
- total count of count-type data
- added missing reason: "Community analysis"
- added reference to MARBEN
PBDB
- added total count of count-type data
Direct uploads
- coccolithophore data: moved total count from samplingEffort to totalCount
Removed
BioTime
- Studies 39 and 217 were removed due to quality reasons.
BioDeepTime v0.6 [2023-01-17]
Added
- The Geobiodiversity Database - the Fenxiang section
Changed
- moved the coccolith data to “Direct uploads”
Deleted
- occurrenceTypeID column and occurenceType table
Acknowledged
- The two Neptune time series that have mixed taxonomy represent true data.