… The government collects and produces information across sectors as diverse as scientific research and internal government functioning. Information is collected about economic indicators, health, product recalls, and government services such as Medicaid. Many databases are already made available in processed formats, but not in raw form.
Different data sets will have qualitatively different privacy implications. Data about internal government functioning will tend to contain information about government employees, while other kinds of data will likely include information about private citizens and businesses. Each of these data types could contain personal information explicitly, or could be used to infer identity. For this reason, each data set will need its own specialized review before it can be published to data.gov.
This holds true for data sensitivity as well – certain kinds of data that have historically warranted higher privacy protections will require special care before they may be released through data.gov. While there is no firm consensus about what kinds of information should be considered “sensitive” in bulk, an array of existing statutes, self-regulatory guidelines and policy proposals provide some basis for deciding what kinds of information about individuals should be granted some measure of special treatment. CDT has compiled a list of such proposals,1 which may be helpful in determining the privacy implications of the release of particular data sets.