LSE Impact of Social Sciences blog: What does Big Data mean to public affairs research? Understanding the methodological and analytical challenges

The following text was originally prepared for LSE’s Impact of Social Sciences Blog and reposted here.

===

The term ‘Big Data’ is often misunderstood or poorly defined, especially in the public sector. Ines Mergel, R. Karl Rethemeyer, and Kimberley R. Isett provide a definition that adequately encompasses the scale, collection processes, and sources of Big Data. However, while recognising its immense potential, it is also important to consider the limitations of using Big Data as a policymaking tool. Using this data for purposes not previously envisioned can be problematic, researchers may encounter ethical issues, and certain demographics are often not captured or represented.

In the public sector, the term ‘Big Data’ is often misused, misunderstood, and poorly defined. Public sector practitioners and researchers frequently use the term to refer to large data sets that were administratively collected by a government agency. Though these data sets are usually quite large and can be used for predictive analytics, administrative data does not include the oceans of information that is created by private citizens through their interactions with each other online (such as social media or business transaction data) or through sensors in buildings, cars, and streets. Moreover, when public sector researchers and practitioners do consider broader definitions of Big Data they often overlook key political, ethical, and methodological complexities that may bias the insights gleaned from ‘going Big’. In our recent paper we seek to provide a clearer definition that is current and conversant with how other fields define Big Data, before turning to fundamental issues that public sector practitioners and researchers must keep in mind when using Big Data.

Defining Big Data for the public sector

Public affairs research and practice has long profited from dialogue with allied disciplines like management and political science and has more recently incorporated insights from computational and information science. Drawing on all of these fields we define Big Data as:

“High volume data that frequently combines highly structured administrative data actively collected by public sector organizations with continuously and automatically collected structured and unstructured real-time data that are often passively created by public and private entities through their internet interactions.”

This definition encompasses the scale of newly emerging data sets (many observations with many variables) while also addressing data collection processes (continuous and automatic), the form of the data collected (structured and unstructured), and the sources of such data (public and private). The definition also suggests the ‘granularity’ of the data (more variables describing more discrete characteristics of persons, places, events, interactions, and so forth), and the lag between collection and readiness for analysis (ever shorter).
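To make the contrast between the two kinds of data concrete, here is a small illustrative sketch of my own (it is not from the article); every field name and value is a hypothetical placeholder:

# Illustrative sketch only: the two kinds of records the definition combines.
# All field names and values are hypothetical placeholders.

# Highly structured administrative record, actively collected by an agency.
administrative_record = {
    "case_id": "2016-00431",
    "service": "unemployment_benefit",
    "date_filed": "2016-03-02",
    "amount_awarded": 1240.00,
}

# Unstructured, passively created record from an internet interaction:
# free text plus whatever metadata the platform happens to attach.
social_media_record = {
    "posted_at": "2016-03-02T14:07:55Z",
    "text": "Third week waiting on my unemployment claim, still nothing...",
    "geo": None,        # often missing; granularity varies by source
    "device": "mobile",
}

print(administrative_record["case_id"], "|", social_media_record["text"][:40])

Analyses of Big Data in public affairs typically try to join signals like these two, which is where the methodological and ethical challenges discussed below come in.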

Methodological and analytical challenges

Defined thus, Big Data promises access to vast amounts of real-time information from public and private sources that should allow insights into behavioral preferences, policy options, and methods for public service improvement. In the private sector, marketing preferences can be aligned with customer insights gleaned from Big Data. In the public sector, however, government agencies are by design less responsive and agile in their real-time interactions, instead using time for deliberation to serve broader public goods. The responsiveness Big Data promises is a virtue in the private sector but could be a vice in the public sector.

Moreover, we raise several important concerns with respect to relying on Big Data as a decision and policymaking tool. While in the abstract Big Data is comprehensive and complete, in practice today’s version of Big Data has several features that should give public sector practitioners and scholars pause. First, most of what we think of as Big Data is really ‘digital exhaust’ – that is, data collected for purposes other than public sector operations or research. Data sets that might be publicly available from social networking sites such as Facebook or Twitter were designed for purely technical reasons. The degree to which this data lines up conceptually and operationally with public sector questions is purely coincidental. Use of digital exhaust for purposes not previously envisioned can go awry. A good example is Google’s attempt to predict the flu based on search terms.

Second, we believe there are ethical issues that may arise when researchers use data that was created as a byproduct of citizens’ interactions with each other or with a government social media account. Citizens are not able to understand or control how their data is used and have not given consent for storage and re-use of their data. We believe that research institutions need to examine their institutional review board processes to help researchers and their subjects understand important privacy issues that may arise. Too often it is possible to infer individual-level insights about private citizens from a combination of data points and thus predict their behaviors or choices.

Lastly, Big Data can only represent those that spend some part of their life online. Yet we know that certain segments of society opt in to life online (by using social media or network-connected devices), opt out (either knowingly or passively), or lack the resources to participate at all. The demography of the internet matters. For instance, researchers tend to use Twitter data because its API allows data collection for research purposes, but many forget that Twitter users are not representative of the overall population. Instead, as a recent Pew Social Media 2016 update shows, only 24% of all online adults use Twitter. Internet participation generally is biased in terms of age, educational attainment, and income – all of which correlate with gender, race, and ethnicity. We believe therefore that predictive insights are potentially biased toward certain parts of the population, making generalisations highly problematic at this time.
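As a concrete illustration of the kind of API-based collection mentioned above, here is a minimal sketch, not taken from our article, that uses the third-party tweepy library against Twitter's recent-search endpoint; the bearer token and the query are hypothetical placeholders:

import tweepy

# Hypothetical placeholder credential; a real project would load this securely.
client = tweepy.Client(bearer_token="YOUR-BEARER-TOKEN")

# Pull a small sample of recent English-language tweets on a policy topic.
response = client.search_recent_tweets(
    query="flu vaccine lang:en -is:retweet",
    tweet_fields=["created_at", "author_id"],
    max_results=100,
)

for tweet in response.data or []:
    print(tweet.created_at, tweet.text[:80])

# Note: everything returned here describes Twitter users only (roughly a
# quarter of online adults in the 2016 Pew data), so inferences beyond that
# population require explicit justification.

The same caveat applies to any platform whose API happens to be convenient for researchers.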

In summary, we see the immense potential of Big Data use in the public sector, but we also believe that it is context-specific and must be meaningfully combined with administratively collected data and purpose-built ‘small data’ to have value in improving public programmes. Increasingly, public managers must know how to collect, manage, and analyse Big Data, but they must also be fully conversant with the limitations and potential for misuse.

This blog post is based on the authors’ article, ‘Big Data in Public Affairs’, published in Public Administration Review (DOI: 10.1111/puar.12625).

Note: This article gives the views of the authors, and not the position of the LSE Impact Blog, nor of the London School of Economics.

About the authors

Ines Mergel is full professor of public administration at the University of Konstanz’s Department of Politics and Public Administration. Mergel focuses her research and teaching activities on topics such as digital transformation and adoption of new technologies in the public sector. Her ORCID iD is 0000-0003-0285-4758 and she may be contacted at ines.mergel@uni-konstanz.de.

Karl Rethemeyer is Interim Dean of the Rockefeller College of Public Affairs & Policy, University at Albany, State University of New York. Rethemeyer’s primary research interest is in social networks and their impact on political and policy processes. His ORCID iD is 0000-0002-5673-8026 and he may be contacted at kretheme@albany.edu.

Kimberley R. Isett is Associate Professor of Public Policy at the Georgia Institute of Technology. Her research is centred on the organisation and financing of government services, particularly in health. Her ORCID iD is 0000-0002-7584-0181 and she may be contacted at isett@gatech.edu.

CfP: Special Issue on Agile Government and Adaptive Governance in GIQ

Special Issue on Agile Government and Adaptive Governance in the Public Sector

Governments around the world have to respond faster to citizen needs, such as the expectation of 24/7 availability and personalized access to government services generated by the so-called ‘Facebook generation’. Seamless user-centric experiences on social networking sites, such as Weibo or Twitter, as well as on online marketplaces such as Amazon, increase the demand for similar experiences with government services. In addition, industry trends, such as Big Data, predictive analytics methods, and Smart City approaches, drive the need to create internal capacity and skill sets to evaluate, respond to, and implement new technologies and internal processes.

The previous new public management era has left many government organizations with a reduced skill set and limited capacity to upgrade their IT infrastructure. As a result, their capability to innovate has deteriorated, owing to increasing incentives to outsource IT development and services in particular. The HealthCare.gov rollout disaster in the U.S. was a clear indication that the role of information management experts in government is oftentimes limited to contract management tasks, such as planning and oversight. One response from government organizations is to create internal innovation labs, organize hackathons, hire Chief Innovation Officers, or try to recruit industry expertise into government.

We observe the first organizational, structural, managerial, procedural, and technological changes to address the changing internal and external environments of government organizations. As an example, the UK and US governments have adopted new organizational structures in the form of digital services teams that are able to respond faster to the ad hoc needs of their internal government clients. They have adopted an agile government approach, designing software in a more information- and user-centric way, as is standard in the IT industry. Once software is developed, it is shared widely across all levels of government and no longer siloed in one department. In addition, governments need to adapt to changes in their internal and external environments and create systems that allow them to scan trends and identify developments, predict their potential impact on the organization, and quickly learn and implement responses (Gong & Janssen, 2012).

This special issue therefore invites papers that address open research questions posed in two recent Viewpoint pieces in Government Information Quarterly, by Janssen & van der Voort (2016) on adaptive governance and by Mergel (2016) on agile government. Adaptive governance should ensure that an organization is able to deal with change while protecting it from becoming unstable. The main characteristics of adaptive governance are decentralized bottom-up decision-making, efforts to mobilize internal and external capabilities, wider participation to spot and internalize developments, and continuous adjustment to deal with uncertainty (Janssen & van der Voort, 2016). An agile government introduces user-centric software development approaches, implemented together with agency-based project managers, to shorten the implementation cycle, improve the outcomes of IT projects, and make sure that user needs are considered (Mergel, 2016).

For this special issue, we welcome conceptual, empirical, qualitative, quantitative or mixed methods research papers. Topics may include, but are not limited to, the following:

  • Conceptualization of agile government and adaptive governance: implications, benefits, and theory building;
  • Specific or distinguishable agile software development approaches for governmental organizations and/or digital public services;
  • Agile software development project management (e.g. Scrum method) in governmental contexts;
  • The impact of applying agile government or adaptive governance on the culture, organizational structure, business processes and individual behaviors;
  • The impact of agile government and adaptive governance on policy-making processes, including information acquisition, negotiation, policy formulation, evaluation and examination;
  • Information sharing and organizational learning in agile government and adaptive governance environments;
  • Adaptation at different levels, traceability and accountability in agile government and adaptive governance projects;
  • Principles and approaches to enable/increase adaptability;
  • Coordination/mediation mechanisms in adaptive governance;
  • Pros and cons of adaptability, barriers and drivers, challenges and opportunities, balance between adaptability, stability, and accountability;
  • In-depth and comparative case studies of agile government and adaptive governance in the public sector; and
  • Whether, and how, agile development approaches lead to user-centric digital government services, processes, and applications.

Special Issue Guest Editors:

  • Ines Mergel, University of Konstanz, contact: ines.mergel@uni-konstanz.de
  • Yiwei Gong, School of Information Management at Wuhan University, contact: yiweigong@whu.edu.cn
  • John Bertot, iSchool at University of Maryland, contact: jbertot@umd.edu

Special Issue Format

Each submission is subject to a rigorous double-blind peer review process with at least two independent reviewers. Authors can contact the guest editors for additional information.

Deadline for manuscript submission: January 1, 2017 (extended until February 15, 2017)

References:

Gong, Y., & Janssen, M. (2012). From policy implementation to business process management: Principles for creating flexibility and agility. Government Information Quarterly, 29(Supplement 1), 61-71.

Janssen, M., & van der Voort, H. (2016). Adaptive governance: Towards a stable, accountable and responsive government. Government Information Quarterly, 33(1), 1-5.

Mergel, I. (2016). Agile innovation management in government: A research agenda. Government Information Quarterly, 33(3), 516-523.

Using social media metrics and big data analytics for actionable insights

Generate actionable insights from social media and big data

Oftentimes social media use is seen as fundamentally different from other forms of formal organizational communication because of its speed and its dynamic, egalitarian, and informal nature. It’s important to align the use of social media with the organizational mission. Citizens can passively absorb the information, and government can abandon other forms of publication and save taxpayer dollars. Public affairs teams can design campaigns to gain attention for certain issues, highlight deadlines in all phases of the policy life cycle (Lasswell 1951), or increase participation in offline behavior and online votes (increase turnout). The result might be increased acceptance of public policy, increased inclusion, and a reduction in inequality of access (Thomas 1993; Bingham, Nabatchi, and O’Leary 2005). Government organizations have the opportunity to defuse misinformation and rumors, lower the costs of negative campaigning by quickly injecting correct, formal information, and bring in innovative knowledge about and from stakeholders.

Suggestions for practitioners aiming to measure the impact of their social media activities:

  1. Understand what you are trying to accomplish (increasing attention, targeting certain constituencies) and what it looks like if your communication succeeds (DiStaso, McCorkindale, and Wright 2011). How are your social media activities supporting the organizational mission, and to what extent do they help you, for example, become more innovative?
  2. Define a social media strategy and the insights needed to optimize your approach to achieving those goals (Mergel 2012).
  3. Develop measures that focus on behavioral outcomes and not just reach (a brief sketch of such measures follows below). Have your posts and online interactions helped citizens change their behavior? Did they go out and participate in an initiative for which you need citizen input? Did they change their behavior by applying for a program?
  4. Display the information on dashboards that are accessible and understandable for decision makers (like the CDC’s social media dashboard) to see immediately how citizens are perceiving the information that is sent out by your organization (DiStaso, McCorkindale, and Wright 2011).
  5. Use the insights to optimize your tactics and identify actionable opportunities for program adjustments (Murdough 2009).

The following flowchart summarizes the steps outlined above:

[Figure: measurement framework flowchart]
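To illustrate step 3, here is a minimal sketch of my own (field names and numbers are hypothetical) showing how reach might be reported alongside behavior-oriented rates rather than on its own:

from dataclasses import dataclass

@dataclass
class PostMetrics:
    impressions: int  # how many times the post was displayed (reach)
    clicks: int       # clicks on the embedded call to action
    sign_ups: int     # citizens who completed the targeted action,
                      # e.g. applied for a program (behavioral outcome)

def summarize(post: PostMetrics) -> dict:
    """Report reach alongside behavior-oriented rates for a single post."""
    return {
        "reach": post.impressions,
        "engagement_rate": post.clicks / post.impressions if post.impressions else 0.0,
        "conversion_rate": post.sign_ups / post.clicks if post.clicks else 0.0,
    }

example = PostMetrics(impressions=12_000, clicks=480, sign_ups=36)
print(summarize(example))
# A post can have a large reach and still convert very few citizens into the
# behavior the campaign was designed for; the conversion rate, not the reach,
# is what step 3 asks us to track.

Numbers like these would then feed the decision-maker dashboards described in step 4.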

New article published: Agile Innovation Management in Government: A Research Agenda

I wrote a paper based on my interviews with CTOs and digital service innovators in the U.S. federal government. The goal of the paper is to bring together the elements that lead to innovations in digital service delivery. I contrast traditional software development processes with elements of an agile innovation management approach. The result is a research framework and research questions for future explorations:

Abstract
Governments are facing an information technology upgrade and legacy problem: outdated systems and acquisition processes are resulting in high-risk technology projects that are either over budget or behind schedule. Recent catastrophic technology failures, such as the failed launch of the politically contested online marketplace Healthcare.gov in the U.S. were attributed to an over-reliance on external technology contractors and failures to manage large-scale technology contracts in government. As a response, agile software development and modular acquisition approaches, new independent organizational units equipped with fast reacting teams, in combination with a series of policy changes are developed to address the need to innovate digital service delivery in government. This article uses a process tracing approach, as well as initial qualitative interviews with a subset of executives and agency-level digital services members to provide an overview of the existing policies and implementation approaches toward an agile innovation management approach. The article then provides a research framework including research questions that provide guidance for future research on the managerial implementation considerations necessary to scale up the initial efforts and move toward a collaborative and agile innovation management approach in government.
Reference: Mergel, I. (2016): Agile Innovation Management in Government: A Research Agenda, in: Government Information Quarterly, 33(3), pp. 516-523.
http://dx.doi.org/10.1016/j.giq.2016.07.004.

New paper: #BigData in Public Affairs published in PAR

Karl Rethemeyer, Kim Isett, and I just published a new paper in Public Administration Review with the title “Big Data in Public Affairs”.

Our goal for this article is to define what big data means for our discipline and to raise interesting research questions that have not yet been explored. Here is the abstract of our article. Please email me if you can’t access the full paper:

This article offers an overview of the conceptual, substantive, and practical issues surrounding “big data” to provide one perspective on how the field of public affairs can successfully cope with the big data revolution. Big data in public affairs refers to a combination of administrative data collected through traditional means and large-scale data sets created by sensors, computer networks, or individuals as they use the Internet. In public affairs, new opportunities for real-time insights into behavioral patterns are emerging but are bound by safeguards limiting government reach through the restriction of the collection and analysis of these data. To address both the opportunities and challenges of this emerging phenomenon, the authors first review the evolving canon of big data articles across related fields. Second, they derive a working definition of big data in public affairs. Third, they review the methodological and analytic challenges of using big data in public affairs scholarship and practice. The article concludes with implications for public affairs.

Reference:

Mergel, I., Rethemeyer, R. K., Isett, K. (forthcoming): Big Data in Public Affairs, in: Public Administration Review, DOI: 10.1111/puar.12625.

Award: Research stipend from IBM’s The Center for the Business of Government

 

IBM – The Center for the Business of Government has announced a new round of winners of their research stipends. I won an award to write about my research on digital service transformation in the U.S. federal government.

Here is the announcement text:

The Center for The Business of Government continues to support reports by leading thinkers on key issues affecting government today.  We are pleased to announce our latest round of awards for new reports on key public sector challenges, which respond to priorities identified in the Center’s research agenda. Our content is intended to stimulate and accelerate the production of practical research that benefits public sector leaders and managers.

My report will focus on the following topic: “Implementing Digital Services Teams Across the U.S. Federal Government”

In 2014, the White House created the U.S. Digital Service team and the General Services Administration’s 18F group. Both groups are using agile software development processes to design and implement high-profile software projects. The results of this report include lessons learned during the scaling up efforts of digital service teams across the departments of the U.S. federal government. These will focus on managerial design aspects, organizational challenges, motivations of digital swat teams and their department-level counterparts, as well as first outcomes in the form of digital service transformations in each department. This research report aims to support the presidential transition team’s efforts by outlining the current efforts of scaling-up digital service teams and their lessons learned, as well as observable outcomes of digital service teams across the U.S. federal government.

Congressional hearing about @18F and @usds operations and @gao report

I just watched the Congressional oversight hearing on 18F and U.S. Digital Service operations. The hearing was prompted by a GAO report titled “DIGITAL SERVICE PROGRAMS: Assessing Results and Coordinating with Chief Information Officers Can Improve Delivery of Federal Projects,” published on June 10, 2016.

The report showed that most agencies were fully satisfied with the digital swat teams that helped them fix their IT problems:

[Screenshot from the GAO report: agency satisfaction results]

The GAO inspectors talked to four CIOs (DHS, DOD, VA, DOS). The first three were fully aware of and happy with the services 18F and USDS provided to their agencies. The State Department’s CIO initially said he had not been sufficiently aware, but then backpedaled and said in follow-up discussions that he had actually been satisfied and involved from the beginning. Nevertheless, GAO found that CIOs need to be fully involved and that IT acquisition should not happen behind their backs, based on the statement of one CIO who had forgotten about the initial meetings he was involved in.

18F and USDS were criticized by the private sector witnesses for their opaque operations and vague agreements, and for introducing an agile delivery BPA process that not all vendors and contractors want to follow because of intellectual property protection concerns. The agile blanket purchase agreement (BPA) encourages future contractors to showcase their code and their ability to use agile methodologies in order to comply with the draft open source policy and lightweight production cycles. Vendors who don’t want to participate won’t be able to be involved in selected future IT acquisitions. Clearly that raises red flags on all sides, but it moves government IT acquisition toward disrupting a clearly broken process.

Communication and transparency are huge factors in explaining how a young start-up inside of government functions, moves its operations along, and comes up with oversight and accountability procedures and structures. The 18F blog is a valuable resource for a general audience, but I do believe there is an industry-inherent over-reliance on publishing code and text on the social coding site GitHub. IT professionals value this resource highly and will find code, reuse it, or help the federal government improve it, but I don’t think the community can expect Members of Congress or GAO inspectors to learn and read GitHub updates. This is where the bureaucracy meets the digital swat teams, and more communication is necessary.

I was dismayed by the low profile USDS and 18F kept in their testimonies. There is so much more data out there, already published on non-traditional outlets such as Medium or blogs, that clearly shows how many millions of dollars the digital teams have saved the agencies they worked with. Why not show the numbers? No one else can show them except the teams that have actually worked on comparing vendor data with digital service team data. Do it!

Generally, I read the statements of the private sector witnesses as a sign that they fear the disruption that 18F and USDS have started. This is a good thing, but it needs to be aligned with the expectations and regulations that are there to protect government and its citizens against rogue behavior. As the chair of the oversight committee said: “Y’all should be holding hands and work on this together, because y’all have the same goal.”

 

[Will update this post as I process some of the statements a bit more]

Award: Emerald Group Publishing Award Citations of Excellence winner 2016

Emerald Group Publishing has named our Public Administration Review paper “A Three-Stage Adoption Process for Social Media Use in Government” a Citations of Excellence winner for 2016! Very excited and grateful to all the authors who were willing to cite our paper.


Suggestions for a Big Data curriculum for public managers

The Journal of Public Affairs Education just published a symposium on Information Technology and Public Affairs Education. The symposium combines articles with a broad range of viewpoints, including IT skills and competencies, challenges adopting new technologies such as GIS, and how these topics can be integrated into the MPA curriculum.

I contributed a paper titled “Big Data in Public Affairs Education”. I found the topic interesting because it challenges program directors to reconsider the existing MPA curriculum: there is a lot of conversation about whether MPA programs should focus on emerging topics that are now relevant to the challenges public sector employees face in their day-to-day operations. These topics include the use of internet-generated data, combining it with administratively collected data, and displaying it on dashboards for real-time decision-making, as well as other topics such as using GIS technology and sensors to make cities ‘smart’. There is usually very little room in the standard curriculum to integrate these topics.

Therefore I decided to review the existing literature, show what is already taught in MPA programs and where the gaps are, and then create a syllabus tailor-made for future public managers. My opinion is that big data is not an IT topic, and it makes little sense to compete with Data Science programs at iSchools or computer science departments. Instead, it is a very real management problem and should be taught from a critical management perspective.

I used Mason’s PAPA model to cover different dimensions of the issue and provided literature and cases that cover 13 different modules for a semester-long course on Big Data Management in the Public Sector.

The journal is open access. The full symposium is available here and my article on Big Data is available here.

Here is the abstract:

Public affairs schools face the challenge of including emergent topics in their curricula to prepare students for the public sector job market. Some such topics reflect advances in the use of information technologies; others reflect updates to industry standards or changing needs of public sector information management professionals. This article focuses on big data that are created through citizens’ use of new technologies and the combination of administratively collected data with online data. Big data require changes in government information management skills, including collection, cleaning, and interpreting unstructured and unfiltered data; real-time decision making based on early signals and patterns that emerge; and new organizational roles and tasks, such as open innovation and change management. This article reviews the existing literature, compares big data requirements in neighboring disciplines, and suggests 13 modules for a big data syllabus that extend Mason’s PAPA model of ethical considerations for the information age.

Please cite as:

Mergel, I. (2016): Big Data in Public Affairs Education, in: Journal of Public Affairs Education, 22(2), pp. 231-248.