Wikidata is a collaboratively edited multilingual knowledge graph hosted by the Wikimedia Foundation. It is a common source of open data that Wikimedia projects such as Wikipedia, and anyone else, can use under the CC0 public domain license. Wikidata is a wiki powered by the MediaWiki software, including its extension for semi-structured data, Wikibase. As of early 2025, Wikidata had 1.65 billion item statements.
Some examples of items and their QIDs are , , , , and .
Item labels do not need to be unique. For example, there are two items named "Elvis Presley": , which represents Elvis Presley, and , which represents his self-titled album. However, the combination of a label and its description must be unique. To avoid ambiguity, an item's QID is hence linked to this combination.
Statements may map a property to more than one value. For example, the "occupation" property for Marie Curie could be linked with the values "physicist" and "chemist", to reflect the fact that she engaged in both occupations.
Values may take on many types including other Wikidata items, strings, numbers, or media files. Properties prescribe what types of values they may be paired with. For example, the property may only be paired with values of type "URL".
Optionally, qualifiers can be used to refine the meaning of a statement by providing additional information. For example, a "population" statement could be modified with a qualifier such as "point in time (P585): 2011" (as its own key-value pair). Values in the statements may also be annotated with references, pointing to a source backing up the statement's content. As with statements, all qualifiers and references are property–value pairs.
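The structure described above (statements as property–value pairs, optionally refined by qualifiers and backed by references) can be sketched as a small data model. This is an illustrative sketch only, not Wikibase's actual implementation: the class names `PropertyValue`, `Statement`, and `Item` are hypothetical, plain-text property names stand in for real property identifiers, and the population figure and reference source are made-up placeholder values.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class PropertyValue:
    """A property-value pair; used for statements, qualifiers, and references."""
    property_id: str
    value: Any


@dataclass
class Statement:
    """A main property-value pair plus optional qualifiers and references."""
    main: PropertyValue
    qualifiers: List[PropertyValue] = field(default_factory=list)
    references: List[PropertyValue] = field(default_factory=list)


@dataclass
class Item:
    label: str
    description: str
    statements: List[Statement] = field(default_factory=list)

    def values_of(self, property_id: str) -> List[Any]:
        """Return every value the item holds for one property."""
        return [s.main.value for s in self.statements
                if s.main.property_id == property_id]


# One property mapped to more than one value.
curie = Item(
    label="Marie Curie",
    description="physicist and chemist",
    statements=[
        Statement(PropertyValue("occupation", "physicist")),
        Statement(PropertyValue("occupation", "chemist")),
    ],
)

# A statement refined by a qualifier and backed by a reference.
# The population value and the source name are placeholders, not real data.
population = Statement(
    main=PropertyValue("population", 1_000_000),
    qualifiers=[PropertyValue("P585", 2011)],  # point in time (P585): 2011
    references=[PropertyValue("stated in", "hypothetical census report")],
)

print(curie.values_of("occupation"))
```

Because qualifiers and references reuse the same pair type as the main statement, a single class covers all three roles in this sketch.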
Properties may also define more complex rules about their intended usage, termed constraints. For example, the property includes a "single value constraint", reflecting the reality that (typically) territories have only one capital city. Constraints are treated as testing alerts and hints, rather than inviolable rules.
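As a rough illustration of how a single-value constraint behaves as a hint rather than a hard rule, the sketch below reports properties that carry more than one value without rejecting the data. The function name `single_value_violations` and the tuple representation of statements are assumptions made for this example; Wikidata's own constraint checking is not shown here.

```python
from collections import Counter


def single_value_violations(statements, constrained_properties):
    """List constrained properties that appear with more than one value.

    `statements` is an iterable of (property, value) tuples.
    Nothing is rejected: the result is only a list of hints to review.
    """
    counts = Counter(prop for prop, _ in statements)
    return sorted(p for p in constrained_properties if counts[p] > 1)


# South Africa is a well-known exception with three capital cities,
# so the "capital" hint fires even though the data is correct.
south_africa = [
    ("capital", "Pretoria"),
    ("capital", "Cape Town"),
    ("capital", "Bloemfontein"),
    ("continent", "Africa"),
]

hints = single_value_violations(south_africa, {"capital", "continent"})
print(hints)
```

An editor reviewing the hint can decide that the multiple values are legitimate, which is why such constraints are advisory.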
Before a new property is created, it needs to undergo a discussion process.
The most used property is , which is used on more than item pages
In Wikidata, lexicographical entries have a different identifier from regular item entries. These entries are prefixed with the letter L, such as in the example entries for and . Lexicographical entries in Wikidata can contain statements, senses, and forms. They make it possible to document word usage, connect words to items on Wikidata, record word translations, and provide machine-readable lexicographical data.
In 2020, the number of lexicographical entries on Wikidata exceeded 250,000. The language with the most lexicographical entries was Russian, with a total of 101,137 lexemes, followed by English with 38,122 lexemes. More than 668 languages have lexicographical entries on Wikidata.
In January 2019, development started on a new extension for MediaWiki to enable storing Shape Expressions (ShEx) in a separate namespace. Entity schemas use identifiers distinct from those used for items, properties, and lexemes: they are stored with an "E" identifier, such as for the entity schema of human data instances and for the entity schema of building data instances. This extension has since been installed on Wikidata and enables contributors to use ShEx for validating and describing Resource Description Framework (RDF) data in items and lexemes. Any item or lexeme on Wikidata can be validated against an entity schema, which makes entity schemas an important tool for quality assurance.
Wikidata includes data collections from other open projects, including Freebase.
Wikidata was launched on 29 October 2012 and was the first new project of the Wikimedia Foundation since 2006. At this time, only the centralization of language links was available. This enabled items to be created and filled with basic information: a label – a name or title, aliases – alternative terms for the label, a description, and links to articles about the topic in all the various language editions of Wikipedia (interwikipedia links).
Historically, a Wikipedia article would include a list of interlanguage links (links to articles on the same topic in other editions of Wikipedia, if they existed). Wikidata was originally a self-contained repository of interlanguage links. Wikipedia language editions were still not able to access Wikidata, so they needed to continue to maintain their own lists of interlanguage links.
On 14 January 2013, the Hungarian Wikipedia became the first to enable the provision of interlanguage links via Wikidata. This functionality was extended to the Hebrew and Italian Wikipedias on 30 January, to the English Wikipedia on 13 February, and to all other Wikipedias on 6 March. After no consensus was reached over a proposal to restrict the removal of language links from the English Wikipedia, they were automatically removed by bots. On 23 September 2013, interlanguage links went live on Wikimedia Commons.
The ability for the various language editions of Wikipedia to access data from Wikidata was rolled out progressively between 27 March and 25 April 2013. On 16 September 2015, Wikidata began allowing so-called arbitrary access, or access from a given article of a Wikipedia to the statements on Wikidata items not directly connected to it. For example, it became possible to read data about Germany from the Berlin article, which was not feasible before. On 27 April 2016, arbitrary access was activated on Wikimedia Commons.
According to a 2020 study, a large proportion of the data on Wikidata consists of entries imported en masse from other databases, which helps to "break down the walls" of information silos.
In 2021, Wikimedia Deutschland released the Query Builder, "a form-based query builder to allow people who don't know how to use SPARQL" to write a query.
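For comparison, the kind of hand-written SPARQL query that such a form-based tool spares users from writing might look like the sketch below. The example is an assumption chosen for illustration and does not come from the Query Builder itself: it asks for items that are an instance of (P31) house cat (Q146), and it only builds a request URL for the public endpoint at `https://query.wikidata.org/sparql` without sending anything. The helper name `build_request_url` is hypothetical.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
ENDPOINT = "https://query.wikidata.org/sparql"

# Illustrative query: up to ten items that are instances of "house cat",
# with English labels supplied by the label service.
QUERY = """
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q146 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 10
""".strip()


def build_request_url(query, endpoint=ENDPOINT):
    """Return a GET URL that asks the endpoint for JSON results."""
    return endpoint + "?" + urlencode({"query": query, "format": "json"})


url = build_request_url(QUERY)
print(url)
```

The resulting URL could be fetched with any HTTP client; the network call is left out so the sketch runs offline.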
In December 2014, Google announced that it would shut down Freebase in favor of Wikidata.
, Wikidata information was used in 58.4% of all English Wikipedia articles, mostly for external identifiers or coordinate locations. In aggregate, data from Wikidata is shown in 64% of all Wikipedias' pages, 93% of all Wikivoyage articles, 34% of all ', 32% of all ', and 27% of Wikimedia Commons.
, Wikidata's data was visualized by at least 20 other external tools and over 300 papers have been published about Wikidata.
A systematic literature review of the uses of Wikidata in research was carried out in 2019.