Assignment 2: "Liao Huanxing" Documentation in Wikidata

Structured Data and Semantic Infrastructure

Clara Holst (CH), AU737540, 202306022@post.au.dk

5th Semester, Cognitive Science BSc, Arts Faculty, Aarhus University

Jens Chr. Skous Vej 2, 8000 Aarhus, Denmark

Critical Data Studies, Elective | Curating Data Course

Characters: 13,279 (approx. 5.5 pages)

Introduction and Context

This report details my work on Wikidata (Fig. 1), which centred on the biography of Liao Huanxing (1895-1964), a Chinese anti-colonial activist whose life and work were documented by the Dekoloniale project. The primary aim was to transform the narrative, qualitative information from this marginalized history into structured, machine-readable data on Wikidata. This process served a dual purpose: to enhance the digital visibility of a figure often omitted from mainstream historical narratives and to engage practically with the principles and challenges of Linked Open Data (LOD). By contributing to an open knowledge graph, the task aimed to connect Liao Huanxing’s story to a wider web of historical data, thereby supporting the findability, accessibility, and reusability of decolonial knowledge resources.

Figure 1

Fig. 1: Liao Huanxing’s new page in Wikidata, Source: Wikimedia/Wikidata, CC BY-SA 4.0.

Personal Role and Responsibilities

My primary contribution was the extensive research and data enrichment for the Wikidata item for Liao Huanxing (Q136450200). Figure 1 shows the newly created item page. Initially, this item did not exist on any Wikimedia project. I was responsible for a comprehensive review of his biography on the Dekoloniale website (Fig. 2) to identify key entities and properties for structuring.

Figure 2

Fig. 2: Liao Huanxing on the Dekoloniale project website, Source: Dekoloniale.de, CC BY 4.0.

I then created and populated numerous statements on his Wikidata item. My contributions, a sample of which is shown in Figure 3, included adding specific details such as his residences, education at Wuhan University, language skills, political affiliations, and alternative names. A critical part of my role involved adding a reference to each statement, linking the data back to its source to ensure verifiability and to honour the Dekoloniale project as a key knowledge producer.

Figure 3

Fig. 3: Overview of my own contributions to Liao Huanxing’s Wikidata item. Source: Wikimedia/Wikidata, CC BY-SA 4.0.

Methodological Approach

My methodological approach began with a close reading of the Dekoloniale biography of Liao Huanxing (Figure 2) to extract factual elements that could be represented in Wikidata. This step involved identifying key pieces of information such as places of residence, education, political affiliations, language skills, and significant relationships. The aim was to determine which elements could be translated into structured statements while maintaining as much historical and contextual accuracy as possible.


Understanding the structure of Wikidata was essential for this process. Wikidata is organized around items, each representing a unique entity such as a person, place, or organization. Each item is identified by a Q-number. These items are described using statements, which consist of a property, identified by a P-number, and a value, which can either be another item or a literal value such as a date or string of text. Each statement can also include qualifiers, which provide additional contextual information such as the start or end dates of an event, and references, which link the statement back to a source. Together, these statements form subject-predicate-object triples, the foundation of Linked Open Data.
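The statement model described above can be sketched in a few lines of Python. This is an illustrative simplification, not Wikidata's actual implementation: the class name `Statement`, its field names, and the placeholder reference URL are assumptions made for the example, while the Q- and P-numbers are the ones used in this report.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Statement:
    """A simplified, hypothetical model of one Wikidata statement."""
    subject: str   # Q-number of the item being described
    prop: str      # P-number of the property
    value: str     # Q-number of another item, or a literal such as a date
    qualifiers: Dict[str, str] = field(default_factory=dict)
    references: List[Dict[str, str]] = field(default_factory=list)

    def triple(self) -> Tuple[str, str, str]:
        """Return the core subject-predicate-object triple."""
        return (self.subject, self.prop, self.value)


# Liao Huanxing (Q136450200) - educated at (P69) - Wuhan University (Q461313)
educated_at = Statement(
    subject="Q136450200",
    prop="P69",
    value="Q461313",
    # P854 is the "reference URL" property; the URL is a placeholder,
    # not the real address of the Dekoloniale biography.
    references=[{"P854": "https://example.org/biography"}],
)

print(educated_at.triple())
```

Qualifiers and references sit alongside the triple rather than inside it, which is why the same core claim can carry different contextual detail without changing what is being asserted.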


Figure 4 illustrates the process of adding a statement to Liao Huanxing’s Wikidata item, showing the interface used to input properties, values, qualifiers, and references. This structured approach allows information to be interlinked across the knowledge graph and connected to other relevant items, increasing its findability, accessibility, and usability.

Figure 4

Fig. 4: Adding a statement to Liao Huanxing’s Wikidata. Source: Wikimedia/Wikidata, CC BY-SA 4.0.

To maximize the interconnectivity of Liao Huanxing’s data, I prioritized linking to existing items in Wikidata whenever possible. For example, rather than entering the university he attended as a free-text value, I linked it to the existing item for Wuhan University (Q461313) using the property “educated at” (P69). This approach ensured that Liao Huanxing’s information became part of the broader semantic network, making it discoverable alongside other related historical figures and institutions.
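The difference between a linked item and free text can be made concrete with a minimal sketch of how a claim looks in the Wikibase JSON format. The helper names (`item_value`, `string_value`, `claim`) are hypothetical and exist only for this illustration. The string variant is shown purely for contrast, since "educated at" (P69) expects an item as its value.

```python
def item_value(item_id: str) -> dict:
    """Build a datavalue that points to another Wikidata item."""
    return {
        "type": "wikibase-entityid",
        "value": {
            "entity-type": "item",
            "numeric-id": int(item_id[1:]),  # "Q461313" -> 461313
            "id": item_id,
        },
    }


def string_value(text: str) -> dict:
    """Build a plain string datavalue, which links to nothing."""
    return {"type": "string", "value": text}


def claim(property_id: str, datavalue: dict) -> dict:
    """Wrap a datavalue in a minimal statement structure."""
    return {
        "type": "statement",
        "rank": "normal",
        "mainsnak": {
            "snaktype": "value",
            "property": property_id,
            "datavalue": datavalue,
        },
    }


# Linked: the value is the item for Wuhan University and can be traversed.
linked = claim("P69", item_value("Q461313"))

# Unlinked, for contrast only: a label that a query cannot follow further.
unlinked = claim("P69", string_value("Wuhan University"))

print(linked["mainsnak"]["datavalue"]["value"]["id"])
```

Only the linked form lets a query move from Liao Huanxing to the university and on to other people educated there.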


Temporal information, such as the dates of political party membership or residence in a specific city, was added using qualifiers to provide historical context. The careful addition of these statements is shown in Figure 4, while the resulting network of interlinked data can be visualized in Figure 5.

Figure 5

Fig. 5: Visualization of Liao Huanxing’s data connections in the Wikidata Query Service (SPARQL query). Source: Wikimedia/Wikidata, CC BY-SA 4.0.
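A query of the kind that produces such a visualization could look like the sketch below. It is not the query used for Figure 5, which is not reproduced in this report. It is a minimal example, assuming the goal is to list every direct statement on the item whose value is another entity. The function names are hypothetical. The script only builds the query text and a request URL; it does not contact the query service.

```python
from urllib.parse import urlencode

ITEM_ID = "Q136450200"  # Liao Huanxing, as given in this report

# Doubled braces are literal braces in the SPARQL text after .format().
# The first line asks the query service interface for a graph view.
QUERY_TEMPLATE = """#defaultView:Graph
SELECT ?item ?itemLabel ?propertyLabel ?value ?valueLabel WHERE {{
  BIND(wd:{item_id} AS ?item)
  ?item ?direct ?value .
  ?property wikibase:directClaim ?direct .
  FILTER(isIRI(?value))
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}"""


def build_query(item_id: str) -> str:
    """Fill the item's Q-number into the SPARQL template."""
    return QUERY_TEMPLATE.format(item_id=item_id)


def query_url(item_id: str) -> str:
    """Build a GET URL for the Wikidata Query Service SPARQL endpoint."""
    params = urlencode({"query": build_query(item_id), "format": "json"})
    return "https://query.wikidata.org/sparql?" + params


print(build_query(ITEM_ID))
```

The `FILTER(isIRI(?value))` line keeps only values that are themselves identifiers, mostly other items, which is what makes the result drawable as a network rather than a flat list of literals.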

Overall, this methodological approach allowed the transformation of a rich, narrative biography into structured, linked data that could be queried, analyzed, and connected to other knowledge resources. By carefully considering the properties, qualifiers, and references used, I ensured that the structured data maintained the integrity of the original biography while enhancing its discoverability and usability within the broader Wikidata ecosystem. Figures 2, 4, and 5 together illustrate the full process, from extracting information from the Dekoloniale biography to modeling it in Wikidata and visualizing its connections in the knowledge graph. This approach demonstrates the potential of Linked Open Data to make marginalized historical figures visible and connected across multiple digital platforms.

Challenges and Considerations

The process was not without significant challenges. A primary difficulty was the inherent tension between narrative nuance and the rigid structure of a database. The biography conveyed the significance of Liao's activism through story, whereas Wikidata required breaking it down into discrete, pre-defined properties. This often felt reductive, stripping away the contextual meaning of his political work. In addition, as seen in Figure 5, Wikidata's interface often showed warnings when it did not recognize new information.


Furthermore, ambiguities in the source text, such as imprecise dates, posed a problem. For example, stating a residence in Berlin without exact years required a decision on how to represent this temporally, balancing accuracy with the information available.
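One way to handle such imprecision is to record how precise a date actually is instead of inventing a day or month. Wikidata's time values carry a precision field for this purpose (8 for a decade, 9 for a year, 10 for a month, 11 for a day). The sketch below is a minimal illustration: the function name `time_value` and the accepted input formats are assumptions, and the example dates are arbitrary placeholders rather than dates from the biography.

```python
import re

# Calendar model identifier for the proleptic Gregorian calendar.
GREGORIAN = "http://www.wikidata.org/entity/Q1985727"


def time_value(text: str) -> dict:
    """Convert 'YYYY', 'YYYY-MM', 'YYYY-MM-DD' or 'YYY0s' to a time value."""
    text = text.strip()
    decade = re.fullmatch(r"(\d{3})0s", text)
    exact = re.fullmatch(r"(\d{4})(?:-(\d{2}))?(?:-(\d{2}))?", text)
    if decade:
        year, month, day, precision = int(decade.group(1)) * 10, 0, 0, 8
    elif exact:
        year = int(exact.group(1))
        month = int(exact.group(2) or 0)
        day = int(exact.group(3) or 0)
        # 9 = year; add one level for a known month, one more for a known day.
        precision = 9 + (month > 0) + (day > 0)
    else:
        raise ValueError(f"unsupported date description: {text!r}")
    return {
        # Unknown month or day is written as 00 rather than guessed.
        "time": f"+{year:04d}-{month:02d}-{day:02d}T00:00:00Z",
        "timezone": 0,
        "before": 0,
        "after": 0,
        "precision": precision,
        "calendarmodel": GREGORIAN,
    }


# Placeholder inputs only; these are not dates from the source biography.
for example in ("1920s", "1923", "1923-05"):
    value = time_value(example)
    print(example, value["time"], value["precision"])
```

A value built this way could serve as a "start time" (P580) or "end time" (P582) qualifier, so that a residence known only to the year is stored as exactly that and nothing more specific.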


Ethically, I was highly conscious of the power dynamics inherent in categorization, as discussed by Bowker and Star (2000). Assigning labels like "activist" and selecting his ethnic group as "Han Chinese" are not neutral acts; they are interpretive choices that frame his identity in specific ways. Ensuring that every claim was rigorously sourced from the Dekoloniale project was my way of anchoring these categorizations in the original, community-driven narrative, thereby mitigating the risk of misrepresentation.

Critical Reflection

Reflecting on this process through the lens of our course readings reveals profound insights into curation as knowledge production. The notion that "raw data is an oxymoron" (Gitelman and Jackson, 2013, p. 2) was vividly demonstrated. The data I created on Wikidata was anything but raw; it was a product of my interpretation, the constraints of the platform's ontology, and the specific perspective of the Dekoloniale source.


This aligns with Kitchin's (2022) concept of "data assemblages," where data is shaped by a complex sociotechnical system: in this case, comprising the Dekoloniale editors, the Wikidata platform, its community norms, and myself as a curator. The work of Bowker and Star (2000) on the politics of classification was also ever-present.


By fitting Liao Huanxing’s life into Wikidata’s property schema, I participated in making his history visible within a particular standardized system, one that, as Ford and Iliadis (2023) note, carries its own Western and structural biases. The outcome is a form of knowledge that is highly accessible and linkable but also flattened. While the process successfully makes a marginalized history more visible to algorithms and automated systems, it simultaneously risks decontextualizing it, separating the factual "what" from the narrative "why" that gives the facts their deeper meaning and political power.


Structured data should thus be understood as a supplement to, not a replacement for, rich narrative sources (Dekoloniale, Figure 2).

Conclusion

In conclusion, this exercise provided invaluable lessons about the potential and limitations of Linked Open Data for recuperating marginalized histories. I learned that making such histories visible requires more than just uploading facts; it demands a careful, ethical, and critically aware approach to data modeling. The power of LOD lies in its ability to connect disparate pieces of information, potentially placing a figure like Liao Huanxing on the same digital map as more widely known historical actors.


However, this comes at the cost of narrative richness. The key lesson is that structured data projects like this are most powerful when understood as supplements to, not replacements for, detailed narrative sources like the Dekoloniale biographies. They serve as vital indexes and entry points, guiding users to the richer, contextualized knowledge held within the original resources.


Ultimately, the work of digital curation is a continuous negotiation between the logic of the database and the complexity of human experience, a practice that is fundamentally about making conscious and responsible choices in the production of knowledge.

References

  • Bowker, G. C., & Star, S. L. (2000). Why classifications matter. In Sorting things out: Classification and its consequences (pp. 319-326). MIT Press.
  • boyd, d., & Crawford, K. (2012). Critical questions for Big Data. Information, Communication & Society, 15(5), 662-679.
  • Ford, H., & Iliadis, A. (2023). Wikidata as semantic infrastructure: Knowledge representation, data labor, and truth in a more-than-technical project. Social Media + Society, 9(3).
  • Gitelman, L., & Jackson, V. (2013). Introduction. In "Raw Data" Is an Oxymoron (pp. 1-12). MIT Press.
  • Iliadis, A., & Russo, F. (2016). Critical Data Studies: An introduction. Big Data & Society, 3(2).
  • Kitchin, R. (2022). The Data Revolution. SAGE Publications Ltd.
  • Wikidata: Liao Huanxing (Q136450200)
  • Dekoloniale Memory Culture in the City
  • Wikidata Query Service Visualization
  • Wikimedia Commons