The short answer: data can be singular or plural. In some formal and technical contexts the plural form is preferred, but the singular form is increasingly common and is fully standard. In most contexts you can write these data or this data, data are or data is, and so on.
Data emerged in English in 1646 as the plural of the Latin datum, which, according to the OED, was the past participle of dare (‘give’) and meant ‘a thing given or granted; a thing known or assumed as a fact, and made the basis of reasoning or calculation; a fixed starting point for a series of measurements etc.’
Datum retains the general meaning of ‘a unit of information’, though it tends to appear mostly in academic and specialist disciplines such as philosophy, surveying, geodesy, topography, technical drawing, and cartography:
Several map datums were erroneous, which threw the hikers off-track.
‘The principal datum input to any search algorithm is a description of its search space.’ (Alan Hutchinson, Algorithmic learning)
‘[T]he paper seen and the seeing of it are only two names for one indivisible fact which, properly named, is the datum, the phenomenon, or the experience.’ (William James, The Meaning of Truth)
The meaning of the derived plural data has changed somewhat over the centuries. The OED definition from the late 19thC (‘Facts, esp. numerical facts, collected together for reference or information’) seems to testify to the broadening influence of the hard sciences. In the 20thC the rapidly expanding fields of information technology incorporated the word into a huge variety of computer-related compound nouns, such as database, data entry, data flow, data mining, data processing, data protection, and data stream.
Plural data is used in many scientific, technical, academic and other formal contexts, though different practices prevail in different places. Among the major news media, The Economist advises the plural usage; The Guardian, singular. The Times Style Guide expressly permits both. Here are some examples of plural usage found via the British National Corpus:
‘Our data are too uncertain to draw firm conclusions’ (Criminal Law Review)
‘Most of the data are new’ (Journal of Gastroenterology and Hepatology)
‘These data are then used to calculate bond enthalpies.’ (Michael Freemantle, Chemistry in Action)
In computing jargon, social sciences, and everyday use, data is often treated as an abstract mass noun, like information. It has the general meaning ‘mass of information’ and takes a singular verb, singular pronoun (it) and singular modifiers (e.g. this, little, much):
‘On this map the data is recorded by county and not by region’ (Peter Hardy, A Right Approach to Economics?)
‘All this data is then written up as a technical report’ (Atkins & Atkins, An Introduction to Archaeology)
‘The retina codes and combines the data so that it can be fed into the 1 million fibres entering the optic nerve’ (Laszlo Solymar, Lectures on Electromagnetic Theory)
Few non-specialists who use the word data think of it as the plural of datum. Similarly, agenda has taken on a singular life of its own, distinct from the near-obsolete agendum, and has given rise to the standard plural agendas. Consider also media (from medium), criteria (criterion), graffiti (graffito), and stamina (stamen). All of these plurals have varying degrees of acceptance and acceptability. Agendas may be common and standard, but medias, datas and criterias are not – at least, not much and not yet.
A note of advice: try to be internally consistent, and be mindful of context. Sometimes one form is preferred: for example, most publishers have a house style to which your text must conform. Even in reputable publications, however, usage is mixed, and discrepancies can result in editorial mix-ups, as Merriam-Webster has shown. Readers who cling to the Latin origins of data may protest the singular form on principle, but this gripe is misguided. I should know: the singular form used to grate on me, but I wised up.