Completion of a Citizen Science Loop with GREENGAGE

Citizen Science loop in thematic co-explorations within Citizen Observatories

Citizen Observatories (COs) empower individuals to actively participate in data collection and environmental monitoring to address local challenges. The GREENGAGE project, under the Horizon Europe framework, aims to make COs more effective and more widely adopted by providing a structured Citizen Science Loop methodology, operationalized by a co-production process and enabled by its GREEN Engine infrastructure. One core contribution of GREENGAGE is the “thematic co-exploration” concept. A thematic co-exploration, in the context of COs, is a collaborative approach where citizens actively participate alongside scientists and other stakeholders in exploring specific themes or topics related to environmental monitoring and observation. Through thematic co-explorations, COs become purposeful: they leverage the collective efforts of individuals, often non-scientists, to gather, share and analyse environmental data, typically facilitated by digital tools and technologies.

This documentation describes the validation of the GREENGAGE co-creation process for thematic co-explorations through a thematic co-exploration carried out on a university campus, which executes the following 6 steps of a Citizen Science (CS) loop:

  1. Problem identification – recognizing research questions or societal challenges suitable for public engagement;
  2. Campaign design – co-selecting and co-designing participatory protocols, data collection methodologies, and toolkits for citizens’ engagement;
  3. Data crowdsourcing – enabling citizen scientists to gather good-quality observations via digital applications, sensors and surveys;
  4. Data analysis & interpretation – employing AI-driven tools to extract insights and make the modelled data meaningful to humans;
  5. Feedback & collective learning – validating findings with humans and providing participants with actionable feedback; and
  6. Action & impact – informing policies, creating solutions, and refining methodologies for future CS campaigns exploring similar or complementary thematic co-explorations.

Citizen Science Loop in GREENGAGE

GREENGAGE platform to enable Citizen Observatories

The CS loop stages shown in the figure above are aligned with the main phases of GREENGAGE’s co-creation process for thematic co-explorations, which has been devised to organize, execute and exploit the results of CS campaigns. The main phases of this process, supported by the Collaborative Environment (fully described at the HOWTO Thematic co-exploration documentation page), are:

  • Phase 1 – preparing: fully aligned with the “problem identification” stage of a Citizen Science loop, comprises the following aspects:

    a) theme selection; b) pilot owners training; c) core team onboarding and d) core team training.

  • Phase 2 – designing: aligned with the “campaign design” stage of a CS loop, comprising:

    a) experiment specification; b) tools’ resources selection; c) tools’ resources customization and d) tools’ resources testing.

  • Phase 3 – experimenting: aligned with both the “data crowdsourcing” and “data analysis & interpretation” steps of a CS loop. It comprises the following activities: observers onboarding, observers training support, data collection, data combination, data analysis, data visualization and evaluation.

  • Phase 4 – sharing: aligned with the “feedback & learning” and “action & impact” stages of a CS loop, comprising the following tasks defined in the GREENGAGE thematic co-exploration process, namely storytelling, policy advocacy and sustainability.

Each phase is supported by GREENGAGE’s infrastructure, named GREEN Engine and fully described at the Citizen Observer journey page, which integrates various digital tools and knowledge assets to streamline the co-production process. The tools and knowledge assets created in GREENGAGE are categorized into the following areas of concern, where the names of the tools defined for each layer are indicated:

  1. Community and Co-production Process Management: Here the emphasis is on building a strong, informed and active community which collaborates through a co-production process, for example by defining hypotheses, formulating research questions or selecting datasets.
  2. Data Crowdsourcing and Capture: Based on the groundwork of the previous area of concern, this area materialises into concrete data collection activities. It is characterized by active participation, leveraging technology to gather curated, high-quality environmental data.
  3. Data Analysis and Insights Generation: In this last area of concern, the collected data is transformed into actionable insights. This is where the data, once transformed into actionable information, becomes a powerful tool for understanding and influencing environmental policy.

The purpose of this documentation page is to exemplify how the CS loop is enabled through the suite of tools and knowledge assets defined by GREENGAGE and shown in the figure below. The next section describes how GREENGAGE validates the Citizen Science Loop through a real-world thematic co-exploration at the University of Deusto.

GREEN Engine areas of concern

Citizen Science loop step-wise co-creation

This section showcases the process, tools and results obtained when applying the GREENGAGE CO-enabling approach to a real use-case, namely, “reflection on the suitability and air quality of important points of interest (POIs) within the campus of the University of Deusto in Bilbao, Spain”. The following subsections describe the different steps completed towards fulfilling the CS loop for this thematic co-exploration. Notice that to coordinate the execution of this co-creation process, a new process was defined in the Collaborative Environment as shown in the following figure.

GREENGAGE's Collaborative Environment

A core aspect of every thematic co-exploration is the set of collaborative (co-design & co-creation) activities that participants take part in. For this use case, the following co-design and co-creation sessions were held:

  1. Initial training, specification of the thematic co-exploration, team and co-creation process setup;
  2. Crowdsourcing campaign and co-design of possible useful visualizations;
  3. Collaborative reflection on the gathered data and analysis results;
  4. Participation in Discourse channel, social media dissemination of public results and policy brief.

A. CS campaign specification

The first step in organizing a thematic co-exploration within a Citizen Observatory is to decide which socio-economic and/or environmental challenge is to be addressed. Several aspects need to be decided at this stage. A useful knowledge asset, or CO enabler as such assets are called in GREENGAGE, is the thematic co-exploration specification template, a Word template which guides the organizers of a thematic co-exploration through the following questions:

  1. WHY – reason why this Citizen Observatory’s thematic co-exploration is needed;
  2. WHO – stakeholders involved and affected, who need to be recruited for the co-exploration to take place and to whom the outcomes will be disseminated;
  3. WHAT – actual endeavours/activities of the Citizen Observatory’s thematic co-exploration towards validating a defined hypothesis and populating a given set of metrics;
  4. WHEN – planning of activities (resources & time) when undertaking the Citizen Observatory’s thematic co-exploration, e.g. the crowdsourcing and data analysis sessions needed;
  5. WHERE – geographical area where the Citizen Observatory’s thematic co-exploration will take place, i.e. area to cover, and the specific points and frequency of measurements needed to ensure valuable crowdsourced data;
  6. WHICH – materials and resources, i.e. actual materials, devices and tools needed to execute the Citizen Observatory’s thematic co-exploration, coming either from GREENGAGE or from other publicly available tools and assets;
  7. HOW – specification of the data analysis processes/workflows needed to capture, analyse and generate the indicators and visualizations sought in the Citizen Observatory’s thematic co-exploration. In this stage the needed visualizations and the storytelling approaches that may eventually be adopted are co-specified too.

As a result, a thematic co-exploration specification for University of Deusto's campus has been produced, where the following decisions were taken:

  • Set up the observers’ team: in this case, 10 researchers associated with the MORElab research group were recruited and committed to engage in the whole co-creation process.
  • Definition of the places (POIs) on the University’s campus (4 were selected) where air quality and place suitability perceptions would be gathered. These POIs are in the surroundings of the Faculty of Engineering’s building at University of Deusto.
  • Definition of metrics to estimate air quality and campus space suitability. Two new metrics were co-defined by the team of observers in a meeting. The first is a Perception of Air Quality Index (PAQI), where people indicate their perception on a scale from 1 to 5, from very clean (no noticeable pollution effects) to highly polluted (major health concerns, unlivable conditions). The second is a Public Space Suitability Index (PSSI), where volunteers express, again on a 1 to 5 scale, their perception of accessibility & connectivity (20%), safety & security (15%), environmental quality (15%), functionality & comfort (20%), sociability & inclusivity (15%) and maintenance & management (15%). Five ranges of suitability were defined, from excellent suitability (average answer values >4) to poor suitability (<1). Apart from perceptions of air quality, those volunteers who had an Atmotube Pro device also collected air quality data with it.
  • Definition of tasks to be performed at each selected POI. Firstly, gather a visual perception (photo) and secondly, complete a short 3-question survey in which volunteers express their perception of place suitability and air quality, with the chance to leave some feedback about the visited POI. Notice that, for the sake of simplicity, one single question per metric was provided to feed the above-mentioned metrics. In a more extensive and thorough real-world thematic co-exploration, one question for each of the factors feeding the devised PSSI should have been included, for instance. The questions designed for the questionnaires to be completed by volunteers at each of the selected POIs are listed next:

  • Do you consider that the Air Quality in this spot is? a) very bad; b) bad; c) normal; d) good; e) very good

  • Do you consider that this POI is suited to facilitate campus life and activities? a) not at all; b) not significantly; c) it is OK; d) good enough; e) very good
  • Provide feedback about this POI in terms of its suitability and/or air quality perception. Any suggestions, improvements that you would add?
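The PSSI percentages listed above can be read as weights of a weighted average. The following Python sketch shows one possible way to compute the index under that reading. It is only an illustration: the factor keys and the function name `pssi` are hypothetical, and treating the percentages as weights of a weighted mean is an assumption, since the campaign itself used a single suitability question.

```python
# Weights taken from the percentages listed in the specification.
# Assumption: they are the weights of a weighted average and sum to 1.0.
PSSI_WEIGHTS = {
    "accessibility_connectivity": 0.20,
    "safety_security": 0.15,
    "environmental_quality": 0.15,
    "functionality_comfort": 0.20,
    "sociability_inclusivity": 0.15,
    "maintenance_management": 0.15,
}


def pssi(scores: dict[str, float]) -> float:
    """Return the weighted average of per-factor scores on the 1 to 5 scale."""
    missing = set(PSSI_WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing factor scores: {sorted(missing)}")
    for factor in PSSI_WEIGHTS:
        if not 1 <= scores[factor] <= 5:
            raise ValueError(f"score for {factor} must be between 1 and 5")
    return sum(weight * scores[factor] for factor, weight in PSSI_WEIGHTS.items())


# Illustrative scores only, not data from the campaign.
example_scores = {factor: 4 for factor in PSSI_WEIGHTS}
example_index = round(pssi(example_scores), 2)
```

With one question per factor, each volunteer's answers would be passed to `pssi`, and the per-POI index could then be averaged across volunteers.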

B. Design of the crowdsourcing campaign

Once the specification of the thematic co-exploration is ready, the next critical stage is to co-design the CS crowdsourcing campaign, based on the hypothesis, objectives and data gaps previously identified in the thematic co-exploration specification for University of Deusto's campus. In GREENGAGE, the following elements must be defined in order to set up a CS crowdsourcing campaign:

  1. A new instance of an observatory entity is defined, specifying a name for it and the geographical area it covers;
  2. A set of POIs within the defined area are defined, for each POI the following fields are defined: type (e.g. culture, nature and so on), description, latitude & longitude, and optional photo;
  3. In each POI several tasks can be defined; usually the same tasks are required at every measurement point of a campaign. For each task the following fields are defined: POI associated, topic (e.g. air quality, safety and so on), type (survey, photo, walk and so on), title, description and geocoordinates;
  4. For tasks of type survey, a survey has to be created or an already existing one linked, providing the following fields: title and a set of questions, where for each question a title, a type (single choice, multiple choice, text request) and options as pairs of id and value must be defined.

The next figure exemplifies the crowdsourcing campaign defined for the thematic co-exploration at Deusto’s campus.

GREENGAGE crowdsourcing campaign specification
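The entities and fields listed above (observatory, POI, task, survey, question) can be summarized as a small data model. The Python sketch below is a hedged illustration only: the class and attribute names are hypothetical and do not reproduce the actual GREENGAGE back-end schema, tasks are nested under their POI instead of carrying a POI reference, and the coordinates are placeholders.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Question:
    title: str
    type: str  # "single_choice", "multiple_choice" or "text"
    options: dict[str, str] = field(default_factory=dict)  # option id -> value


@dataclass
class Survey:
    title: str
    questions: list[Question]


@dataclass
class Task:
    topic: str  # e.g. "air quality", "safety"
    type: str  # e.g. "survey", "photo", "walk"
    title: str
    description: str
    latitude: float
    longitude: float
    survey: Optional[Survey] = None  # only set for tasks of type "survey"


@dataclass
class POI:
    type: str  # e.g. "culture", "nature"
    description: str
    latitude: float
    longitude: float
    photo: Optional[str] = None  # optional photo reference
    tasks: list[Task] = field(default_factory=list)


@dataclass
class Observatory:
    name: str
    area: str  # geographical area covered
    pois: list[POI] = field(default_factory=list)


air_quality = Question(
    title="Do you consider that the Air Quality in this spot is?",
    type="single_choice",
    options={"a": "very bad", "b": "bad", "c": "normal", "d": "good", "e": "very good"},
)
survey = Survey(title="POI perception survey", questions=[air_quality])
# Placeholder coordinates, not the real position of any campus POI.
task = Task("air quality", "survey", "Rate this POI", "Short perception survey", 0.0, 0.0, survey)
poi = POI("campus", "Example POI", 0.0, 0.0, tasks=[task])
observatory = Observatory("Deusto campus", "University of Deusto, Bilbao", pois=[poi])
```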

The next figure shows the dashboard that has been defined as the back-end of the GREENGAGE app, which can be used to configure CS crowdsourcing campaigns in GREENGAGE. For more details on how to use this interface, check the documentation of the GREENGAGE back-end's crowdsourcing campaign configuration dashboard.

GREENGAGE crowdsourcing campaign specification

C. Collect, extract, transform and load campaign data

On Friday 14th March 2025, from 11:30am to 12:30pm CET, a crowdsourcing campaign was executed in which 10 people took part. Before starting the crowdsourcing campaign, volunteers were requested to complete the following actions:

  1. They had to log into https://me.greengage-project.eu/ and complete a sociodemographic form like the one shown in the figure below. This form requires each participant to specify their role in the thematic co-exploration, gender, age range, work status, education level and so on. Very importantly, volunteers must also accept a consent form at the bottom of this page, which allows GREENGAGE to process the data they supply and aggregate it with other participants' sociodemographic data, while preserving the privacy of participants at all times. Only after completing this form are users allowed to log into the GREENGAGE app devised to capture data.

     GREENGAGE participant's identity manager

  2. They had to complete a PRE Impact evaluation questionnaire like the one shown below. Its PDF printout shows how volunteers in a CS campaign held within GREENGAGE are questioned about their perception of environmental, political, scientific and social impact before they take part in a thematic co-exploration.

     GREENGAGE's volunteers' PRE Impact Evaluation questionnaire

  3. They had to download and install the GREENGAGE app on their smartphones from either Android's Google Play store or Apple's App Store. Notice that the GREENGAGE app also allows users without credentials to register as in step 1, so that they can access the app.

After these preparation activities, volunteers were ready to launch the GREENGAGE app, as shown below, and use it to collect data.
GREENGAGE app screens at POI3: (1) check into POI; (2) answer questionnaire; (3) submit responses

Right after concluding the crowdsourcing campaign, volunteers were also requested to complete a POST Impact evaluation questionnaire like the one shown below. Its PDF printout shows how volunteers in a CS campaign held within GREENGAGE are questioned about their perception of environmental, political, scientific and social impact after they take part in a thematic co-exploration.

GREENGAGE's volunteers' POST Impact Evaluation questionnaire

The crowdsourced campaign data was collected by applying an ETL (extract, transform, load) process. While the PRE and POST impact and satisfaction questionnaires are hosted in Google Forms, the data collected by the GREENGAGE app is hosted on an Apollo Server which exposes a GraphQL API. This API allows clients (such as mobile apps or web frontends) to efficiently query and interact with campaign data. This interface was used to retrieve all crowdsourced data associated with the POIs and spots generated in the crowdsourcing campaign of the thematic co-exploration described above.

Through the Apollo Server front-end, shown below, details in JSON about tasks performed in the crowdsourcing campaign are retrieved by means of a GraphQL query. Notice that GraphQL is a modern API query language and runtime that allows clients to request precisely the data they need in a single request, unlike RESTful APIs which rely on multiple endpoints and fixed data structures, often leading to over-fetching or under-fetching of data.

GREENGAGE's Apollo Server front-end for its back-end data model
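A GraphQL request of this kind can be sketched in Python with the standard library alone. This is a minimal illustration and not the script used in GREENGAGE: the endpoint URL, the `tasks` field, its `observatoryId` argument and the selected sub-fields are all hypothetical, because the real schema is not reproduced on this page. Only the request shape (a JSON body with `query` and `variables` sent via HTTP POST) follows the usual GraphQL-over-HTTP convention.

```python
import json
import urllib.request

# Hypothetical query: field and argument names are assumptions,
# not the real GREENGAGE schema.
TASKS_QUERY = """
query TasksByObservatory($observatoryId: ID!) {
  tasks(observatoryId: $observatoryId) {
    id
    type
    title
  }
}
"""


def build_graphql_request(endpoint: str, query: str, variables: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a GraphQL query."""
    body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholder endpoint and identifier, for illustration only.
request = build_graphql_request(
    "https://example.org/graphql", TASKS_QUERY, {"observatoryId": "obs-1"}
)
```

Calling `run_query(request)` against a real endpoint would return the JSON document with exactly the fields named in the query, which is the property of GraphQL highlighted above.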

The ETL process for the crowdsourcing campaign was implemented as a Python script that uses asynchronous programming patterns to efficiently extract, transform and load the GREENGAGE app’s data. For data extraction, the script interfaces with the GREENGAGE app’s GraphQL API endpoint to retrieve mission data, including various mission types (e.g. SURVEY, WALK or DATASET). Additional observatory-related data is collected to provide geographical context for each mission during the transform stage of the process. Finally, the script also interfaces with a Keycloak-based authentication service customized for GREENGAGE, which allows participants’ sociodemographic data to be extracted. Internally, this extension of Keycloak has a PostgreSQL database which can be queried through SQL. This service enables anonymization while preserving demographic information. Through the interface of GREENGAGE's identity manager (check its documentation), 10 volunteers completed their sociodemographic details and were granted credentials to access GREEN Engine's tools, including the GREENGAGE app.
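The asynchronous extraction pattern described above can be illustrated with `asyncio`. The sketch below is an assumption-laden stand-in, not the project's script: `fetch_missions` is a hypothetical coroutine that returns canned records instead of calling the GraphQL API, and it only shows how requests for several mission types can be issued concurrently and merged.

```python
import asyncio


async def fetch_missions(mission_type: str) -> list[dict]:
    """Stand-in for a GraphQL call; returns canned records for one mission type."""
    await asyncio.sleep(0)  # yields control, as a real network call would
    return [{"id": f"{mission_type.lower()}-1", "type": mission_type}]


async def extract(mission_types: list[str]) -> list[dict]:
    """Fetch all mission types concurrently and flatten the results."""
    batches = await asyncio.gather(*(fetch_missions(t) for t in mission_types))
    return [mission for batch in batches for mission in batch]


# asyncio.gather keeps the order of its arguments in the returned list.
missions = asyncio.run(extract(["SURVEY", "WALK", "DATASET"]))
```

In a real script, `fetch_missions` would await an HTTP client call, so slow responses for one mission type would not block the others.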

During transformation, mission data is processed according to type-specific rules. For example, a SURVEY type task requires additional API calls to retrieve the associated quantitative survey values, and a GEOTRACKING type mission requires an additional API call to retrieve the associated GeoJSON object. The transformation phase maps tasks to observatory information, enriches records with anonymized user demographic data, and converts all data to a standardized CSV format with appropriate fields for analysis. This step is crucial because Apache Druid, the real-time analytics database used in the loading phase, requires data to be ingested in a structured format. The Druid deployment set up for GREENGAGE (check its documentation) stores data internally in a columnar format, which optimizes it for fast aggregations and queries.
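A transformation step along these lines might look as follows in Python. This is a sketch under stated assumptions: the record layout, the field names in `FIELDS` and the `extras` dictionary (which stands in for the results of the additional API calls) are hypothetical, and only the general idea of type-specific rules, enrichment and CSV output comes from the description above.

```python
import csv
import io
import json

# Hypothetical output columns, chosen for illustration only.
FIELDS = ["mission_id", "type", "observatory", "age_range", "value"]


def transform(mission: dict, observatories: dict, demographics: dict, extras: dict) -> dict:
    """Turn one raw mission into a flat row using type-specific rules."""
    if mission["type"] == "SURVEY":
        # Stand-in for the extra call that retrieves the survey values.
        value = json.dumps(extras["survey_values"][mission["id"]])
    elif mission["type"] == "GEOTRACKING":
        # Stand-in for the extra call that retrieves the GeoJSON object.
        value = json.dumps(extras["geojson"][mission["id"]])
    else:
        value = ""
    return {
        "mission_id": mission["id"],
        "type": mission["type"],
        "observatory": observatories[mission["observatory_id"]],
        # Only a demographic attribute is kept; the user id is dropped.
        "age_range": demographics[mission["user_id"]]["age_range"],
        "value": value,
    }


def to_csv(rows: list[dict]) -> str:
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Illustrative inputs only, not campaign data.
raw_missions = [{"id": "m1", "type": "SURVEY", "observatory_id": "o1", "user_id": "u1"}]
observatory_names = {"o1": "Deusto campus"}
user_profiles = {"u1": {"age_range": "25-50"}}
extra_data = {"survey_values": {"m1": [4, 5]}, "geojson": {}}
csv_text = to_csv(
    [transform(m, observatory_names, user_profiles, extra_data) for m in raw_missions]
)
```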

Finally, the load phase utilizes Apache Druid's ingestion API to load the transformed data. The script generates a comprehensive ingestion specification defining data types, dimensions, and granularity settings to optimize subsequent analytical queries. Once ingested, the data becomes immediately queryable through Druid's SQL API, which enables seamless integration with visualization platforms like Apache Superset and other analytics tools.
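To make the notion of an ingestion specification more concrete, the Python sketch below builds a minimal native batch spec as a dictionary. It is an illustration rather than the spec generated by the GREENGAGE script: the datasource name, the column names and the granularity choices are assumptions, and the inline input source is used only to keep the example self-contained.

```python
import json


def build_ingestion_spec(
    datasource: str,
    csv_data: str,
    dimensions: list[str],
    timestamp_column: str = "timestamp",
) -> dict:
    """Build a minimal Druid native batch ingestion spec for inline CSV data."""
    return {
        "type": "index_parallel",
        "spec": {
            "dataSchema": {
                "dataSource": datasource,
                "timestampSpec": {"column": timestamp_column, "format": "iso"},
                "dimensionsSpec": {"dimensions": dimensions},
                "granularitySpec": {
                    "type": "uniform",
                    "segmentGranularity": "DAY",  # assumption: one segment per day
                    "queryGranularity": "NONE",  # keep original timestamps
                    "rollup": False,  # store every answer as its own row
                },
            },
            "ioConfig": {
                "type": "index_parallel",
                "inputSource": {"type": "inline", "data": csv_data},
                "inputFormat": {"type": "csv", "findColumnsFromHeader": True},
            },
            "tuningConfig": {"type": "index_parallel"},
        },
    }


# Illustrative CSV content and names only, not campaign data.
sample_csv = "timestamp,poi,answer\n2025-03-14T11:45:00Z,POI4,good\n"
spec = build_ingestion_spec("survey_answers_example", sample_csv, ["poi", "answer"])
spec_json = json.dumps(spec)
```

A spec like this would typically be submitted with an HTTP POST to Druid's task endpoint (`/druid/indexer/v1/task`); for larger datasets, a file-based or HTTP input source would usually replace the inline one.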

Notice that 3 ETL processes were set up to extract, transform and load data from:

  1. Photos gathered through task 2 (see crowdsourcing campaign's spec);
  2. Survey answers associated with the different users and POIs where surveys were answered through task 1 (see crowdsourcing campaign's spec);
  3. Sociodemographic data completed by volunteers when they signed up to take part in the observatory, by means of the https://me.greengage-project.eu page shown at identity manager's interface.

As a result of these ETL processes, data was stored in the Apache Druid infrastructure (see Apache Druid's interface), which is the storage solution chosen within GREEN Engine.

GREENGAGE Apache Druid deployment

As a result of this campaign, the following measurements were gathered:

  • 10 people completed a sociodemographic questionnaire which, after a consent form was signed, granted them access to the GREENGAGE app.
  • 10 people also completed the PRE Impact evaluation questionnaire, as part of the ex-ante and ex-post evaluation approach adopted by the project.
  • 10 people completed the POST Impact evaluation questionnaire.
  • 21 photos were gathered at spots defined near the 4 POIs visited by the 10 volunteers, where potential issues were identified.
  • 90 answers to the 3-question survey associated with each of the 4 POIs defined in the campaign were gathered.
  • 180 air quality measurements were gathered by the 4 Atmotube devices carried by volunteers (39 PM2.5 measurements).

D. Analyse data through visualisations & reflection

Once the data had been loaded into Druid, a range of visualizations was created with Apache Superset at GREENGAGE's deployment (check its documentation). A Superset dashboard was created to analyse survey answers at each POI. The figure shows the analysis of answers to the question “How do you rate air quality at POI4?”. The pie chart at the top left shows that volunteers’ perception was good or very good in 70% of the cases, i.e. for 7 of the 10 volunteers that took part. The table below the pie chart shows how the answers were distributed across 4 of the 5 categories (no answer for “bad” was received). Interestingly, the PM2.5 concentration gathered by the 4 volunteers who also carried an Atmotube device while they went around with the GREENGAGE app is shown at the bottom right. During the crowdsourcing campaign, the concentration of PM2.5 was in the range 1 to 7.8 μg/m3. Most studies indicate that PM2.5 at or below 12 μg/m3 is considered healthy, with little to no risk from exposure. If the level reaches or exceeds 35 μg/m3 during a 24-hour period, the air is considered unhealthy and can cause issues for people with existing breathing conditions such as asthma. Hence, users’ perceptions of air quality match what the Atmotube devices measured.

GREENGAGE Apache Superset's visualization of survey results for a POI
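The comparison between perceptions and sensor readings can be reproduced with a few lines of Python. The sketch below is illustrative only: the function names are hypothetical, the label for readings between the two thresholds is our own, the 35 μg/m3 threshold refers to a 24-hour period and is applied to single readings as a simplification, and the answer list is invented to be consistent with the reported 70% share, since the per-answer data is not reproduced on this page.

```python
def pm25_band(ug_m3: float) -> str:
    """Classify a PM2.5 reading against the two thresholds quoted in the text."""
    if ug_m3 <= 12.0:
        return "healthy"
    if ug_m3 >= 35.0:
        return "unhealthy"
    return "between thresholds"


def share_positive(answers: list[str]) -> float:
    """Fraction of survey answers that are 'good' or 'very good'."""
    positive = sum(1 for answer in answers if answer in ("good", "very good"))
    return positive / len(answers)


# Invented distribution, consistent with 7 of 10 positive answers and no "bad".
illustrative_answers = (
    ["very good"] * 3 + ["good"] * 4 + ["normal"] * 2 + ["very bad"]
)
positive_share = share_positive(illustrative_answers)

# Bounds of the PM2.5 range reported for the campaign.
observed_bands = {pm25_band(1.0), pm25_band(7.8)}
```

Both ends of the reported range fall in the "healthy" band, which is the sense in which the predominantly positive perceptions agree with the measurements.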

Additionally, results from all questions and all POIs were aggregated in other charts. For instance, the figure about survey results for the whole campus shows that in more than 55% of the cases, volunteers considered that the air quality at all 4 selected POIs was good or very good.

GREENGAGE Apache Superset's visualization of survey results for a POI
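Aggregations such as the ones behind these charts can also be requested directly from Druid's SQL API, which is what makes the data usable from Superset and other tools. The following Python sketch only builds the JSON payload for such a request. The table name `survey_answers` and the columns `poi`, `question` and `answer` are hypothetical, as the actual datasource layout is not described here.

```python
import json


def build_sql_payload(question: str) -> dict:
    """Build a Druid SQL API payload counting answers per POI for one question."""
    sql = (
        'SELECT "poi", "answer", COUNT(*) AS "n" '
        'FROM "survey_answers" '  # hypothetical datasource name
        'WHERE "question" = ? '
        'GROUP BY "poi", "answer"'
    )
    return {
        "query": sql,
        "resultFormat": "object",  # one JSON object per result row
        "parameters": [{"type": "VARCHAR", "value": question}],
    }


payload = build_sql_payload("Do you consider that the Air Quality in this spot is?")
payload_json = json.dumps(payload)
```

The payload would be sent with an HTTP POST to the `/druid/v2/sql` endpoint of the Druid deployment; the share of "good" and "very good" answers per POI can then be derived from the returned counts.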

E. Evaluation of the thematic co-exploration experience and results

Analysing the data visualized in Superset allowed us to draw a few conclusions. The perceptions of air quality were aligned with the actual measurements obtained through more reliable devices such as Atmotube. This was observed per POI (see the figure with findings for POI4) and in the figure with cross-campus survey results, which aggregates the air quality perceptions at all POIs. Similarly, information about the suitability of the different places was gathered and analysed at each POI and for the whole campus, taking the average value of the PSSI at the 4 points that represent the campus. Finally, some feedback was gathered from those visiting the POIs, who mainly took pictures at spots, i.e. locations, close to the POIs where they found something remarkable regarding the suitability or the air quality of the place. The following figure shows an HTML visualization generated with all the spots collected. Source code for this HTML is available.

Photos about spots of interest in Deusto's campus

Analysing the spots and the comments received in question 3 of survey 1 (see figure for findings for POI4), it was found that, although the status of the campus is generally considered quite adequate, the fact that pedestrians and cars often share common spaces might be troublesome.

The following figure summarizes the sociodemographic profile of the participants in the campaign: 3 women and 7 men, owning 5 iPhones and 6 Android devices, half of them not associated with GREENGAGE. The predominant age range was 25-50 years, and participants were digitally literate, with master's or PhD studies.

Sociodemographic details of participants in the thematic co-exploration

F. Feedback and action: Social media dissemination, policymaking, replication

To close the CS loop and contribute towards the positive transformation of Deusto’s campus in Bilbao, the following actions were taken:

  1. A Discourse channel was created through which participants could communicate with each other. Results were announced through this channel (see figure below), and participants had the opportunity to comment on the generated visualizations and on the interpretations made by all of them.
  2. X and LinkedIn social media channels were used to summarize the main conclusions of the completed thematic co-exploration.
  3. The policy brief template CO enabler was used to create a policy brief, which was sent to the vice-chancellor of the University of Deusto responsible for the campus refurbishment.
  4. An impact analysis of the executed campaign in terms of social, political and citizen science aspects was performed, following the ACTION project’s impact evaluation approach.
  5. The following datasets from this thematic co-exploration were uploaded to the Zenodo community for GREENGAGE, at the following entry:

     • Anonymized sociodemographic data of participants in the thematic co-exploration organized at University of Deusto's campus in Bilbao
     • Anonymized dataset with aggregated survey answers for the 4 POIs defined at the campus of University of Deusto
     • Dataset with all snapshots (photos) captured at different spots in the University of Deusto's campus
     • Atmotube Pro sensor measurements from 4 different devices (users), captured between 11:30 and 13:30 CET local time
     • PRE (before the participation in the CS campaign) impact evaluation questionnaire answers and report
     • POST (after the participation in the CS campaign) impact evaluation questionnaire answers and report

  6. Impact analysis of this thematic co-exploration was published [TO BE COMPLETED]

Sociodemographic details of participants in the thematic co-exploration

Conclusion

This tutorial has demonstrated how the GREENGAGE infrastructure and the suite of enablers that make up its Academy successfully bridge the gap between citizen participation and environmental decision-making. The completion of a whole Citizen Science Loop for the Bilbao campus of the University of Deusto has been demonstrated. 10 participants with high digital literacy took part and made use of a wide range of tools and CO enablers, namely the Collaborative Environment, Discourse, Apollo Server, Apache Druid and Apache Superset, together with enablers such as the “thematic co-exploration specification” or the “policy brief template”.

The main insight is that laypeople, after brief training, can follow the right protocol to gather data that is good enough for effective analysis or, at least, for reflecting on the visualizations that result from data analysis. Still, generating visualizations with Superset and aggregating and curating data through ETL processes are complex tasks that require programming and data analysis skills. The interpretation of charts, however, can be made accessible when more expert people explain and discuss results with volunteers. Hence, continuous guidance and support for volunteers in CS observatories is critical to maintaining their interest throughout the whole thematic co-exploration. Besides, several feedback loops are necessary to keep the community engaged after each of the steps needed to complete the CS loop, e.g. through a Discourse channel.