Twin Cities: Synthetic household-level demographics

We’re excited to introduce Twin Cities: a synthetic dataset we’ve developed that includes household-level information while protecting privacy. Here we’ll explain the impetus for creating this dataset and walk through a dashboard we’ve created for you to explore it for yourself.

The need for high-resolution demographic data

Every five years the Australian Bureau of Statistics (ABS) performs a census to collect data on everyone in Australia. Information is collected on dozens of factors such as age, languages spoken, occupation, and income. It is used to understand the demographics of the country and to inform policy and funding decisions.

The ABS makes statistical summaries of census data conveniently available through QuickStats. You choose a statistical area (such as a suburb) and QuickStats then shows you a range of statistics about the people, families, and dwellings in that area.

Although it’s important that census data be made available so it can be put to good use, it’s also essential to protect the privacy of individuals. This means that information is always provided at the aggregate level, so by design you can’t access data about individuals or households. For example, using QuickStats you can see that households in the suburb of North Melbourne have a median weekly income of $1236, but you can’t see the income of a particular household.

However, there are many cases where individual- or household-level data is necessary. This is where Twin Cities comes in.

How does Twin Cities help?

We’ve created a synthetic dataset that recreates the data provided by QuickStats but also allows much more fine-grained analysis.

Unlike QuickStats, Twin Cities lets you explore at the household and even the individual level rather than limiting you to aggregated data for an area. Simulated individuals are placed in specific dwellings and have characteristics such as sex and occupation. You can also access family-level and household-level information. We are continually adding more to this dataset to make it even more useful.
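To make the idea concrete, here is a minimal sketch of how household-level synthetic records can roll up into area-level summaries of the kind QuickStats reports. It is an illustration only: the `Person` and `Household` classes, their field names, the `area_summary` function, and every value below are hypothetical and made up for this example, not the actual Twin Cities schema or data.

```python
from dataclasses import dataclass, field
from statistics import median


@dataclass
class Person:
    # Hypothetical individual-level attributes
    age: int
    sex: str
    occupation: str


@dataclass
class Household:
    # Hypothetical household-level attributes
    region: str
    dwelling_type: str
    weekly_income: int
    members: list[Person] = field(default_factory=list)


# Invented records for illustration; not real or Twin Cities data
households = [
    Household("Example Area", "apartment", 900,
              [Person(34, "F", "nurse")]),
    Household("Example Area", "house", 1200,
              [Person(41, "M", "teacher"), Person(39, "F", "engineer")]),
    Household("Example Area", "house", 1500,
              [Person(52, "F", "manager"), Person(55, "M", "chef"),
               Person(17, "M", "student")]),
]


def area_summary(records, region):
    """Aggregate household-level records into area-level statistics."""
    selected = [h for h in records if h.region == region]
    persons = [p for h in selected for p in h.members]
    return {
        "dwellings": len(selected),
        "persons": len(persons),
        "people_per_household": len(persons) / len(selected),
        "median_weekly_income": median([h.weekly_income for h in selected]),
    }


summary = area_summary(households, "Example Area")
print(summary)
```

The aggregate view is derived from the detailed records, so the same data can answer both area-level questions and questions about a single simulated household.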

Exploring Twin Cities

To introduce Twin Cities to the world, we’ve made a dashboard for you to explore. Read more about it below and have a play for yourself.

On the first page you can explore the demographics of different areas within Melbourne. You can access statistics such as the number of dwellings, families, and persons in that area, as well as the average number of people per household and the percentages of males and females. There are also plots showing the numbers of different dwelling types, household types, family compositions, and age distributions for each region.

Explore summary statistics for the selected area

Twin Cities really goes beyond QuickStats by showing you the locations and particulars of all households in the selected region. Remember, this is synthetic data, so these are not real households or people. However, in aggregate they recapitulate the statistical properties of the selected area, and the additional granularity means that we can also explore the data at the household and individual level.

In the second map, you can see at a glance the dwelling type and household size of each household in the selected area. Mousing over a household will show you more information about it.

Access information about individual households

You can visualise the distribution of different types of households on the second page of the dashboard. After you select the filters you want to apply (household type, dwelling type, and household size), the map shows the locations of households that meet your criteria, and a table lists the number of matching households by region.

View the distribution of households meeting specific criteria
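If you were working with household-level records directly, the same kind of filtering could look roughly like the sketch below. This is not how the dashboard itself is built: the record fields, the `filter_households` and `count_by_region` functions, and the sample values are all hypothetical and invented for illustration.

```python
from collections import Counter

# Invented household records for illustration; not real or Twin Cities data
households = [
    {"region": "Region A", "household_type": "family", "dwelling_type": "house", "size": 4},
    {"region": "Region A", "household_type": "lone person", "dwelling_type": "apartment", "size": 1},
    {"region": "Region A", "household_type": "family", "dwelling_type": "apartment", "size": 3},
    {"region": "Region B", "household_type": "family", "dwelling_type": "house", "size": 2},
    {"region": "Region B", "household_type": "family", "dwelling_type": "house", "size": 5},
    {"region": "Region B", "household_type": "group", "dwelling_type": "house", "size": 3},
]


def filter_households(records, household_type=None, dwelling_type=None, size=None):
    """Keep records matching every filter that is set; None means 'any'."""
    def matches(record):
        return (
            (household_type is None or record["household_type"] == household_type)
            and (dwelling_type is None or record["dwelling_type"] == dwelling_type)
            and (size is None or record["size"] == size)
        )
    return [record for record in records if matches(record)]


def count_by_region(records):
    """Count how many records fall in each region."""
    return Counter(record["region"] for record in records)


family_houses = filter_households(
    households, household_type="family", dwelling_type="house"
)
counts = count_by_region(family_houses)
print(dict(counts))
```

Leaving a filter unset keeps every value for that attribute, which mirrors choosing only some of the dashboard filters.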

What would you like to do with Twin Cities?

To discuss how your organisation can use Twin Cities or to tell us what you think, get in touch.