⚓ T250919 Add row/cell annotations to tabular data
Wed Apr 22 2020
Tabular data on Commons supports marking the source of the data, but only as a single source for the entire table. This does not match how data tends to be sourced in real life: often every row, column or cell of a table has different references. (Examples: per-row, per-column, per-cell.) Although much rarer, other types of annotations (plain textual notes rather than sources, e.g. comments on methodology) are also sometimes used, and these too can apply to a row, column or cell.
If tabular data is ever to serve as the primary way of storing data (as opposed to an intermediary for data taken from external sources or parsed from a wiki table), it needs to be able to support cell-level references and notes. Ideally it would support column-level and row-level ones too, although one could hack around that by providing an extra column or row where the cells contain nothing but sources.
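For illustration, the extra-column workaround mentioned above might look roughly like the following. This is a minimal sketch that assumes the usual shape of a Commons tabular data page (`license`, `description`, `sources`, `schema.fields`, `data`); the field names, the values and the helper `sources_for_column` are invented for the example and do not come from the task.

```python
# Hypothetical table: all field names and values are made up for illustration.
table = {
    "license": "CC0-1.0",
    "description": {"en": "Example population figures"},
    # The only sourcing slot available today: one string for the whole table.
    "sources": "Single table-wide source string",
    "schema": {
        "fields": [
            {"name": "city", "type": "string", "title": {"en": "City"}},
            {"name": "population", "type": "number", "title": {"en": "Population"}},
            # Workaround: a plain string column that holds nothing but sources.
            {"name": "population_source", "type": "string", "title": {"en": "Source"}},
        ]
    },
    "data": [
        ["Springfield", 30000, "Census office report, 2019"],
        ["Shelbyville", 25000, "Municipal yearbook, 2018"],
    ],
}


def sources_for_column(table, value_field, source_field):
    """Pair each value in value_field with the text in its source column."""
    names = [f["name"] for f in table["schema"]["fields"]]
    v, s = names.index(value_field), names.index(source_field)
    return [(row[v], row[s]) for row in table["data"]]


pairs = sources_for_column(table, "population", "population_source")
print(pairs)
```

Note that nothing in the schema tells a client that `population_source` annotates `population`; that link exists only by naming convention, which is one of the weaknesses of the workaround.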
Event Timeline
At this point additional annotations could only be done as extra columns. This would work for many cases, but probably not all. Could you give some examples of where columns won't be enough, and the dedicated annotation system would be required?
The task description does have examples. But also, how would the extra column be rendered by a dedicated client (either an internal tabular data editing UI or something external that tries to display or convert it)? There is no "source" datatype, so it would have to be defined as plaintext but actually contain wikitext, or something ugly like that. Sources should be their own type, with their own data structure.
Tgr: Agreed on the importance of this. There should be a canonical space for annotations, which could be a source or other (imagine the entire output of your favorite {{cite}} or {{footnote}} template) -- just as though every cell has a pre-generated footnote for all such details. There is already a common norm of providing an extra source column (or row, more rarely) -- which would still be useful where there is no need to repeat the same source information for each cell. Then the composite annotation for a cell could be (row note + column note + cell note). Cc @Thadguidry who had related thoughts on the pros and cons of storing this data as extra columns.
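The composite idea (row note + column note + cell note) can be sketched as below. This is only an illustration under assumptions: the `Annotations` class, its three dictionaries and the lookup order are hypothetical and are not part of any existing tabular data format.

```python
from dataclasses import dataclass, field


@dataclass
class Annotations:
    """Hypothetical container for notes at three levels of a table."""
    rows: dict = field(default_factory=dict)     # row index -> list of notes
    columns: dict = field(default_factory=dict)  # column name -> list of notes
    cells: dict = field(default_factory=dict)    # (row index, column name) -> list of notes

    def for_cell(self, row, column):
        """Composite annotation: row notes, then column notes, then cell notes."""
        return (
            self.rows.get(row, [])
            + self.columns.get(column, [])
            + self.cells.get((row, column), [])
        )


notes = Annotations(
    rows={0: ["row note"]},
    columns={"population": ["column note"]},
    cells={(0, "population"): ["cell note"]},
)

composite = notes.for_cell(0, "population")
print(composite)
```

With this layout a source shared by a whole row or column is stored once, and a cell only carries what is specific to it.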
@Tgr I strongly oppose storing wiki markup inside columns because it makes the system far less portable and less stable. Wiki markup only works in the context of a specific wiki, and elsewhere it would either render differently or simply break -- templates, localization settings, and modules are wiki-specific. I think the best course of action is to introduce a new column type, with the same structure as what Wikidata has for references. This way the designer of each table can decide 1) whether such a source column is actually needed, and 2) whether there should be just one or multiple source columns. What's more, I think the top-level source field should have the same format for consistency -- this lets a source be 1) per table, 2) per row, or 3) per column (just for the relevant columns).
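A dedicated column type along these lines could be sketched as follows. Everything here is an assumption made for illustration: the type name `reference`, the simplified shape (a mapping from Wikidata property IDs to lists of values, only loosely modeled on Wikidata references) and the helpers `is_reference` and `validate_rows` are hypothetical. P854 (reference URL) and P813 (retrieved) are existing Wikidata properties; the URL and date are placeholders.

```python
import re

# Wikidata property IDs look like "P" followed by a positive integer.
PROPERTY_ID = re.compile(r"^P[1-9][0-9]*$")


def is_reference(value):
    """True if value looks like {property ID: [one or more values]}."""
    if not isinstance(value, dict) or not value:
        return False
    return all(
        isinstance(key, str)
        and PROPERTY_ID.match(key) is not None
        and isinstance(vals, list)
        and len(vals) > 0
        for key, vals in value.items()
    )


def validate_rows(fields, rows):
    """Return (row index, field name) for every malformed reference cell.

    None is accepted and means "no reference for this row".
    """
    problems = []
    for r, row in enumerate(rows):
        for f, cell in zip(fields, row):
            if f["type"] == "reference" and cell is not None and not is_reference(cell):
                problems.append((r, f["name"]))
    return problems


fields = [
    {"name": "population", "type": "number"},
    {"name": "population_ref", "type": "reference"},  # hypothetical column type
]
rows = [
    [30000, {"P854": ["https://example.org/census"], "P813": ["2020-04-22"]}],
    [25000, "[[Some wiki page]]"],  # wiki markup is rejected, not rendered
    [20000, None],
]

problems = validate_rows(fields, rows)
print(problems)
```

Because the cell is structured data rather than wikitext, a client outside the wiki can display or convert it without access to templates or modules.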
Hi all - My personal opinion, and that of a few other experts, would be to embrace DRY (Don't Repeat Yourself - or others) and simply allow the introduction of the W3C standards for Tabular Data. Do you need me to comment on UI display approaches for CSV on the Web metadata? Here are some UI tools that were developed during the course of CSV on the Web development: https://github.com/ODInfoBiz CSV on the Web and related standards are actually really, really good and well thought out. The standard is not the problem... it's the implementations using the standard, and knowing about the richness it affords. It's actually a lightweight standard, just as CSV itself is, which was the intent... to not get too much in the way of data publishers.