On September 15th, I will participate in the European Establishment Statistics Workshop 2021 (EESW21). During Session 4, "Use of alternative data sources", I will present the work that I have been doing with my colleagues Ger Snijkers and Sofie De Broe: "A new Method for Automating Business Data Collection of Official Statistics through APIs: System-to-System data communication applied to John Deere data".
You can find the presentation abstract below:
A new Method for Automating Business Data Collection of Official Statistics through APIs
José Gómez Pérez, Ger Snijkers, and Sofie De Broe
Statistics Netherlands, Methodology Department, Heerlen
Sample surveys are a primary data collection method. In the 20th century, sample surveys proved to be a cost-efficient method for producing accurate statistics, although they come at a high cost both for National Statistical Institutes (NSIs) and for businesses, which may experience a high response burden. Nowadays, in the information age, smart industries such as precision farming offer many new digital data sources. In some cases, these data sources allow for data communication with other computer systems without human intervention via Application Programming Interfaces (APIs). Based on these software interfaces, we developed a system-to-system (S2S) data collection methodology that reduces response burden by automating the business data collection process. As far as we know, this method has not been used before. A pilot software prototype based on this new methodology was developed and applied to the official crop yield survey.
At the workshop, we will demonstrate how data capture and processing can be automated. Furthermore, we will present the results obtained with the prototype in a small-scale field test with real data from a few farmers. Data quality issues, among others, will also be discussed.
By design, this method is easily extendable. In the short term, we plan to extend it to other agricultural surveys, such as the crop protection survey. Our methodology can be applied to arable farming in other countries because the main machine manufacturers (for instance, John Deere, CNH Industrial and Claas) provide APIs and operate in markets all over the world. Sectors other than agriculture can benefit from this methodology too: transportation and economic activity tracking, for instance, are fields where many APIs are available. Because the design is based on standard technologies, the method can also be applied by other National Statistical Institutes.
You can find additional information about the workshop here: