Frequently Asked Questions (FAQs) about AuditEngine
Q: How does AuditEngine work?
Simply stated, AuditEngine uses the ballot images of each ballot, which are now produced by the voting system when the ballot is first scanned, to provide a detailed independent tabulation of the results. AuditEngine compares this independent tabulation ballot-by-ballot with the results of the voting system to find the exact ballots where we differ. Our system works best if we audit all contests, because that gives us the most information about the marking habits of each voter, so we can tell the difference between a true mark and a "hesitation mark" or a crease in the ballot.
AuditEngine runs in the cloud, where we can harness the power of large data centers. We are currently authorized to run up to 10,000 computers in parallel to complete the analysis of the images in a very short period of time, typically less than 15 minutes per run of each stage.
Q: What are "phases", "stages", and the "processing pipeline"?
AuditEngine uses a "processing pipeline" which consists of individual "stages". Each stage takes inputs and produces outputs, and once the data is created by that stage, it is not altered later. A later stage can run as soon as data produced by an earlier stage is created. This structure is best for an auditing platform because we can check the validity of each stage output based on the set of inputs. Each set of data is tracked using cryptographic hash values to detect if the data has been changed from when it was originally built.
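As a rough illustration of this structure, here is a minimal sketch in Python (with hypothetical file and function names, not AuditEngine's actual code) of how a stage's output files might be hashed and later re-verified:

```python
# A minimal sketch of per-stage data fingerprinting, so any later change
# to the data is detectable.
import hashlib
import json
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Hash a file in 1 MB chunks so large ballot image archives fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def record_outputs(stage: str, outputs: list[Path], manifest: Path) -> None:
    """Record the hash of each file a stage produced."""
    data = json.loads(manifest.read_text()) if manifest.exists() else {}
    data[stage] = {str(p): sha256_of(p) for p in outputs}
    manifest.write_text(json.dumps(data, indent=2))

def verify_inputs(stage: str, inputs: list[Path], manifest: Path) -> bool:
    """Re-hash each input of a later stage and compare with the recorded value."""
    recorded = json.loads(manifest.read_text()).get(stage, {})
    return all(recorded.get(str(p)) == sha256_of(p) for p in inputs)
```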
The stages involved in any given audit will vary depending on the data provided, the voting system vendor, reports required, and other considerations. For the most part, the stages included in any given audit will be determined automatically based on the type of audit, vendor system, and specific audit settings. There are also a number of tools available that operate outside the scope of the pipeline and use data generated by the pipeline but do not alter it. There is an opportunity for others to contribute plugins to perform additional processing and reports either in or outside the pipeline.
These stages can be conceptually grouped into "phases", which correspond to the major steps required to conduct the audit. AuditEngine can be viewed as four phases:
1. Setup and Upload Data, Perform Consistency Checks: Create the Election and an Audit Job on our website, then upload ballot images, cast vote records, PDF ballot style masters, and any aggregated reports that are available. Create an Election Information File (EIF), which provides the official names of the contests and options. This is normally derived from the CVR and then edited by hand to add the exact strings used on BMD ballots. The components of this phase became more involved than we expected due to variations in the data provided, particularly when there are repeated ballot images, differences between the CVR and ballot images, and the like. The result of this phase can be an invaluable check that the data matches other aspects of the election, such as the official number of ballots cast.
Stages in this Phase:
- precheck - simply lists all the input files associated with the audit.
- gen_biabif - reviews all the image files and extracts metadata, including filenames, sizes, and any election metadata such as precinct, group, etc. that may be encoded in the pathnames. The "BIF" is the ballot information, established as a set of CSV (comma-separated values) tables. Any records with duplicate ballot_ids are segregated into a separate residue table.
- cvr_to_eif - if the CVR is available, then the EIF (Election Information File) can be generated directly from the CVR data. This produces the DRAFT_EIF. This may need to be amended by hand to include the exact strings used on BMD ballots or the number of write-ins if that is not directly provided.
- parse_eif - after any changes have been made, the EIF is parsed and incorporated as the contests_dod.json file.
- preparse_cvr - for ES&S, the CVR is provided in .xlsx format with some issues regarding column names and embedded images. These files are preparsed to create simpler, easier-to-handle CSV files. For Dominion, this stage is not needed.
- gen_cvrbif - creates the cvr_bif, which holds the metadata from the CVR for each record in the CVR. Please note that the cvr_bif records are in a different order from the bia_bif. Again, these tables are in CSV format.
- combine_bifs - this stage essentially performs a full outer join of the bia_bif and cvr_bif to create the biacvr_bif, which contains all the records from the BIAs, amended with metadata from the cvr_bif (see the sketch after this list).
- gen_fullbif - this stage is not always required; cost savings result if we already have full style information for each record in the CVR. If not, this stage essentially generates that style information, and it is also necessary if we are unsure how the style information in the CVR relates to the style information on the ballot images. It involves a fairly expensive evaluation of all hand-marked (nonBMD) ballot images to extract only the style indication; BMD ballots are not processed. The result is the full_bif, which now includes the metadata extracted from the images for nonBMD ballots. This stage is parallel-processed in the cloud with up to 10,000 computers, each usually processing 100 images.
- create_bif_report - at this point, the BIF report is generated. It provides a summary of all the metadata, a chance to check the consistency of the data between the CVR, the BIA listing, and the ballot images, and help in determining how ballot style information will be handled.
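To make the combine_bifs concept concrete, here is a hypothetical illustration of a full outer join of the two BIF tables; the column names are assumptions, not the actual AuditEngine schema:

```python
# Full outer join of the image-derived BIF and the CVR-derived BIF on ballot_id.
import pandas as pd

bia_bif = pd.read_csv("bia_bif.csv", dtype={"ballot_id": str})
cvr_bif = pd.read_csv("cvr_bif.csv", dtype={"ballot_id": str})

biacvr_bif = bia_bif.merge(
    cvr_bif,
    on="ballot_id",
    how="outer",           # keep records that appear in either table
    suffixes=("_bia", "_cvr"),
    indicator=True,        # adds a '_merge' column: both, left_only, right_only
)

# Rows found in only one source point to missing images or missing CVR records.
print(biacvr_bif["_merge"].value_counts())
biacvr_bif.to_csv("biacvr_bif.csv", index=False)
```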
2. Create Style Templates and Map the Styles: Ballot image data is processed to create style templates based on the style strategy determined by reviewing the bif_report. Style templates are generated from the ballot image data itself, by combining up to 50 base images to clarify and improve the fidelity of the templates.
Note: To expedite turnaround during elections when AuditEngine is used in cooperation with election districts, this phase is completed in an audit of the Logic and Accuracy Test (LAT) data. Then, in the real election, AuditEngine can be configured to simply import_targetmap from the LAT audit job, and the next stage to run will be extractvote. This eliminates the time-consuming stage involving human-eye assistance to map the data.
Stages in this phase:
- build_template_tasklists -- by reviewing the full_bif with style information determined, ballots are grouped to provide a list of 60 candidate ballots to combine for each style, if at least 60 ballots are available. Otherwise, it lists as many as are available.
- gentemplates -- this stage combines up to 50 of the best candidate ballots to form clarified and improved template images for each style (see the sketch after this list). This is run in parallel, with one computer allocated to each style.
- create_templates_report -- This stage creates a report of the operation of the gentemplates stage.
- build_target_mapper_package -- after the quality of the templates is reviewed, the data is packaged up for use by the TargetMapper App.
- Run TargetMapper -- this is an AuditEngine app which provides a user-friendly and powerful interface to generate the target map. This can be a time-consuming step, depending on the number of styles and the complexity of the election. For average elections, this can take about half a day, including checking the redline_proofs.
- import_targetmap -- the file 'targetmap.json' is imported and checked for consistency. If there are any mapping issues, this stage generates an error message to help the user locate the mis-mapping. If an error occurs, the user runs TargetMapper again, unlocks the map to allow changes, makes the corrections, relocks it, and then runs this stage again until no errors remain.
- create_all_redlined_proofs -- these images are derived from the templates, with red annotation added so they can be checked. This stage simply creates the images.
- create_styles_report -- this report lists all the styles and provides the redline proof of each style to allow for quality assurance checks of the mapping.
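The gentemplates idea of combining many ballots into one clarified template can be sketched as a per-pixel median of aligned images. This is only an illustration under stated assumptions (the images are already aligned and identically sized, and the file names are hypothetical):

```python
# Median-combine aligned grayscale ballot images of one style into a template.
import numpy as np
from PIL import Image

def build_template(image_paths: list[str]) -> Image.Image:
    """Combine up to 50 grayscale ballot images with a per-pixel median."""
    stack = np.stack([
        np.asarray(Image.open(p).convert("L"), dtype=np.uint8)
        for p in image_paths[:50]
    ])
    # The median suppresses individual voter marks and scanner noise,
    # leaving the printed layout common to all of the ballots.
    return Image.fromarray(np.median(stack, axis=0).astype(np.uint8))

paths = [f"style_042_ballot_{i:02d}.png" for i in range(50)]  # hypothetical files
build_template(paths).save("template_style_042.png")
```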
3. Vote Extraction: Process all images again, but this time including BMD ballots, and create an independent tabulation. The CVR is not used at all during this phase.
Stages used in this phase:
- extractvote -- this stage uses the map which was imported in the prior phase and fully checked for consistency. It pulls the ballot images directly from the ZIP archives, extracts the votes from each ballot, including BMD and nonBMD ballots, and creates the 'marks' data file, which provides the evaluation of every target or BMD ballot selection. AuditEngine performs this step using parallel processing, in chunks of usually 100 ballots per computer, with up to 10,000 computers in parallel (see the sketch after this list). It uses adaptive thresholding to convert the marks into votes, and a set of heuristics to make good guesses when there are hesitation marks or cross-outs. It also performs OCR on all BMD ballots to "read" the text, so the QR codes are not relied upon.
- gen_extractvote_delegation_report -- This report details the status of all delegations and CPU time used for each one.
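The chunked fan-out used by extractvote can be sketched conceptually as follows; a local process pool stands in for the cloud fleet, and extract_chunk is a hypothetical placeholder for the real per-chunk work:

```python
# Split the ballot list into ~100-ballot chunks and process them in parallel.
from concurrent.futures import ProcessPoolExecutor

CHUNK_SIZE = 100

def chunks(items: list, size: int):
    """Yield successive fixed-size slices of a list."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def extract_chunk(ballot_ids: list[str]) -> list[dict]:
    """Placeholder for the per-chunk vote extraction work."""
    return [{"ballot_id": b, "marks": []} for b in ballot_ids]

if __name__ == "__main__":
    ballot_ids = [f"{n:06d}" for n in range(1, 25001)]  # hypothetical 25,000 ballots
    with ProcessPoolExecutor() as pool:
        results = pool.map(extract_chunk, chunks(ballot_ids, CHUNK_SIZE))
    marks = [rec for chunk in results for rec in chunk]
    print(f"extracted {len(marks)} ballot evaluations")
```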
4. Comparison and Reporting: The tabulation created by AuditEngine is then compared ballot-by-ballot with the official results, and any variants and disagreements are categorized into more than 40 categories, followed by automated report generation.
Stages used in this phase:
- cmpcvr -- short for "compare CVR". This stage processes the result of the extractvote stage by comparing the evaluation of each ballot with the official result (a simplified sketch appears after this list). For Dominion, it also compares with the pre-adjudicated and post-adjudicated snapshots in the CVR.
- gen_cmpcvr_report -- this produces the full discrepancy reports, including pie charts and reports by precinct and by contest. It details the first (usually) 50 variants from the first 10 contests, the closest 10 contests, and the 10 most variant contests, as well as the 10 most variant precincts. This is a lengthy report that is best viewed on the web, where the details of each variant ballot can be examined.
- gen_source_audit_report -- this compares the aggregated totals from the 'source' archives with the official results. The 'source' archives are the main ballot image archives.
- gen_final_report -- This generates a short report which provides links to the other reports.
- hard_lock_job -- this stage simply locks the job so it cannot be altered in the future. This is a semantic lock.
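As a simplified illustration of the cmpcvr comparison for one contest on one ballot, here is a sketch with hypothetical names and far fewer categories than the 40+ the real pipeline distinguishes:

```python
# Compare AuditEngine's evaluation with the CVR's for a single contest.
def categorize(ae_votes: set[str], cvr_votes: set[str],
               vote_for: int, gray_flagged: bool) -> str:
    if len(ae_votes) > vote_for and len(cvr_votes) > vote_for:
        return "agreed_overvote"          # both sides saw too many marks
    if ae_votes != cvr_votes:
        return "disagreed"                # the two tabulations differ
    return "gray_flagged_agreed" if gray_flagged else "agreed"

print(categorize({"SMITH"}, {"SMITH"}, vote_for=1, gray_flagged=False))  # agreed
print(categorize({"SMITH"}, set(), vote_for=1, gray_flagged=True))       # disagreed
```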
Other Optional Stages:
There are a number of optional stages that are sometimes used:
- gen_verify_bia_bif -- this stage processes verification images, which are scanned using scanners that are not part of the voting system, to provide a check on the images and detect possible image manipulation.
- extract_verify -- this stage is like extractvote but it uses the verification image archives. The verification images use the same map which was generated for the 'source' (main) images.
- gen_verify_audit_report -- this compares the verification samples with the official results on an aggregated basis.
Q: What are the "AuditEngine Apps"?
AuditEngine has a number of browser-based apps that assist the user to conduct the audits. These apps include:
- AuditEngine frontend -- this browser app provides for:
- creating districts
- creating elections within those districts
- uploading files related to each audit, including:
- ballot images in ZIP archives (BIAs),
- cast vote records (CVRs) (possibly in zip archives)
- ballot style masters (BSMs),
- official aggregated results
- creating audits for those elections
- establishing the settings for audit executions
- running each stage of those audits
- viewing reports and intermediate results
- locking and unlocking audits
- TargetMapper -- This browser-based app assists with the mapping of contests and options, as they are named in the CVR, to the targets as they are shown on hand-marked paper ballots, for each style. This app has copy and paste operations that can expedite the mapping; sometimes there are hundreds or thousands of styles.
This app can run in a number of modes:
- Ballot Images Only -- If the CVR is not available, the mapping can still occur: the user essentially builds the list of contests on each style by choosing them from all contests for that style. This requires additional checking because of the lack of information from the CVR regarding the contests on each style.
- Ballot Images and CVR -- In this mode, the CVR is available, and thus the contests that are on each ballot style are known (as the contests in each style are derived from the CVR), which reduces mapping errors. Since many styles are similar on a given side of a sheet, the app provides "Paste-Similar", which can find the similar styles on that side and paste the contests. This must be checked later, but it is usually accurate. Worst case, there must be a mapping operation for each style.
- Ballot Images, CVR, and Ballot Style Masters -- In this mode, the full ballot style masters are available, and they provide the location of each contest "rendition" on the ballot. Once a ballot rendition is found and mapped, it can be checked that it matches the rendition on other styles, and it can be quickly mapped. Worst case, there must be a mapping operation for each contest rendition.
- This mode can also be used if the ballot style masters do not have timing marks or style encoding, as long as there is a legible style indication on each style that can be extracted by OCR or read by human eye. However, it is much better if the ballot style masters are complete, with timing marks and the barcoded information that represents the style.
- AdjudicatorApp -- This browser app provides a user interface to check flagged comparison results. It presents the evaluation by AuditEngine, the evaluation by the voting system, and the portion of the ballot showing the user's marks or BMD card. The AdjudicatorApp can also be used without any voting system results, to fine-tune the results by AuditEngine by reviewing any records that have been flagged for further review.
- TallyApp -- This app is usable by individuals or teams that want to evaluate the vote by human eye. It is similar to the AdjudicatorApp, but it does no comparison; all or a subset of contests can be entered, and ballots can be randomly sampled to conduct a statistical check of the results based on the ballot images.
Q: What reports does AuditEngine generate?
There are a number of primary reports, and these have a number of subsections that may otherwise be considered separate reports. In addition, each stage may have a report of its operation, especially if it delegates processing to a fleet of compute instances in the cloud. Then there are the CSV files produced by each stage and potentially used by the next, all of which are observable.
The primary reports and their sections are as follows:
- Precheck Report -- This report simply lists all input files specified and provides their hash values.
- Ballot Image Archive Metadata Report
- Provides Ballot Image Archives Summary, which are cumulative values across all archives
- Raw ballot_ids, unique ballot_ids, Repeated Ballot_ids, Number of Precincts, Number of Parties, Number of groups, Number of batchids, Number of BMD ballots detected, Largest ballot image, Smallest ballot image, Number of huge files.
- Table of metadata about each archive:
- archive basename - this is the file name of the zip archive.
- files - number of files
- images - number of image files
- raw_ballotids - total ballotids even if they are repeated.
- ballotids - net number of unique ballotids
- repeats - the first instance of each ballot that is repeated. Repeats occur most easily across several archives, but can even occur in the same archive if the pathname has extra components. For example, the ballot could appear as /precinct25/93457i.pdf and /precinct25/precinct25/93457i.pdf, with two 'precinct25' components in the path name, while the file is identical. If repeats are found, they should be checked to make sure they actually are repeats. Please note that the same ballot could be scanned twice and given two different numbers; such a repeat is hard to find because the files are also digitally different.
- skipped_repeats - ballots with the same number as a ballot already encountered; these are skipped. These records are moved to the residue table and are not included in the main table.
- bmds - The number of BMD ballots in this archive.
- precincts - the number of different precincts in this archive.
- parties - the number of different parties in this archive.
- groups - the number of different groups in this archive.
- batchids - the number of different batchids in this archive. For ES&S, this will normally be 0. For Dominion, the batchid is defined as the TabulatorId and the BatchId combined to create a unique batch indicator.
- lowest_ballotid, highest_ballotid -- This makes the most sense for ES&S because they use sequential integers for the ballotid.
- Plot of the Distribution of Ballot Image File Sizes -- This is a histogram plot also showing thresholds.
- Ballot Image File Size Extremes - This table provides metadata for the smallest and largest ballot image files.
- Sheet Sizes - The count, average size, and standard deviation (StdDev) for each sheet, BMDs and all.
- Sanity Check Calculation - a calculation is attempted to estimate the missing or extra ballot images vs. ballots cast.
- Election Information File (EIF) Parsing Report -- This report provides, for each contest, the name of the contest, the bmd_contest_name which is found on BMD ballots, and the ballot summary text for that contest.
- Ballot Information File (BIF) Report (Metadata Report)
- Metadata Summary
- Printed County Report - this report displays any printed county names that differ from the expected county name
- Card code to style_num analysis
- Precinct Report
- Ballots with alignment errors or unreadable barcodes
- Templates Report -- Shows the ballot_ids for the ballots used to create the templates, and each template.
- Style Redline Proofs Report -- Same as the Templates Report, but this time with each contest and option outlined with red boxes and the name of the contest and option written near it.
- Discrepancy Report -- This is the result of comparing the votes extracted by AuditEngine with those extracted by the voting system. This is a lengthy report and is not suitable for printing. It includes the following:
- Introduction, to make sure the reader understands our terminology.
- Metadata Summary: This metadata summary includes comparison counts between the CVR, Images and Cast.
- Summary of Discrepancy Records:
- High-Level Reconciliation by sheets and by contests, including pie charts.
- Audit-Engine Flagged Report, by sheet and by contests, including pie charts.
- Contest Variants Breakdown, by sheet and by contests, including pie charts.
- Normal Disagreed (No write-ins or overvotes) by sheet and by contests, including pie charts.
- Non-additive Groups - including Contest Variants, Disagreed, Ballot Variants, uncategorized (should be 0) and Blank sheets.
- Ballot Variants
- Detailed groups
- Write-ins Detailed, by sheet and by contest, including pie charts. Includes both agreed and disagreed write-ins. Please note that AuditEngine does not review detailed write-in information, as this is normally done extensively by the election office by human eye.
- Overvotes Detailed, by sheet and by contest, including pie charts, agreed and disagreed.
- Gray Flagged agreed votes.
- Relevant Settings.
- Contest Discrepancy Table. Each contest is summarized as one line in a table, with the following fields:
- Total: Total ballots cast which included this contest with images that were processed by AuditEngine.
- NonVariant: Ballots with this contest where the official outcome and the evaluation by AuditEngine agreed, and did not include write-ins, overvotes, and were not flagged as 'gray'.
- Agreed Overvotes: Ballots where both AuditEngine and the voting system detected an overvote.
- Agreed Write-ins: Ballots which included write-ins in terms of a marked target, but where the name written-in may not be a qualified write-in candidate, and the write-in may be correctly attributed as a vote for a listed candidate.
- Agreed Undervotes: Undervotes are very numerous, so we do not break them down here; they are not included in the discrepancy report unless they are disagreed or gray-flagged.
- Disagreed: These are ballot-contests which were not initially evaluated as overvotes or write-ins, and where the evaluation by AuditEngine disagrees with the voting system.
- Gray Only: Ballot-contests where AuditEngine detected an ambiguous mark on this contest or used heuristics to decide voter-intent. This column omits any ballots which are in the columns for write-ins, overvotes, or disagreed ballots, even if AuditEngine internally flags them as gray.
- All Variants: Ballots with Agreed Overvotes, Agreed Write-ins, Disagreed, or Gray Flagged. The number of ballots cast should equal the sum of "Plain Agreed" and "All Variants". The components of this column are highlighted.
- Disagreed% of Margin: This provides a good measure of whether the variants may have any impact on the outcome, and the highest five values are highlighted. Further analysis is still required to see if the disagreements will reduce the current margin of victory.
- Variant% of Margin: This provides a maximum measure of whether the variants may have any impact on the outcome, and the highest five values are highlighted. Typically, the vast majority (perhaps 90%) of All Variants are Agreed Write-ins and Agreed Overvotes, which may only rarely result in any changes in the outcome.
- Vote Margin: This is the margin of victory, i.e. the gap between votes for the runner-up and the winner (the lowest winner if a contest has multiple winners) among the ballots processed. It may be a subset of the total margin for the entire district if AuditEngine did not receive or process all ballot images.
- Contest Details: Each contest is then reviewed in detail. To limit the size of the report, contests are only detailed if they are one of the first 10 contests, the closest 5 contests, or the top 5 contests with the most variants. Also, any contest of interest can be reviewed in detail.
- CVR results for this contest.
- Summary of the comparison results for this contest (same as the line in the Contest Discrepancy Report)
- Disagreed ballots by group, detailed by record type. These are summary tables for each record type. Click on the group designation to go to the individual records.
- Normal Disagreed
- Write-ins
- Overvotes
- Gray-Flagged
- Individual records for each discrepancy. This is a lengthy section which shows the discrepancy record followed by the image of the ballot, front and back. Click on the thumbnail and the full size image is displayed in another window.
- Precinct Report Summary Table -- Each precinct is summarized as a single line in a table. Columns in this table are similar to the contests table.
- Precinct Details -- Precincts are detailed if they have the highest Disagreed% of Total or highest Variant% of Total. For each precinct, they are first broken down by group, then detailed to the ballot. Ballots are shown as thumbnails and can be viewed in full resolution by clicking.
- Final Report -- The final report provides top-level metadata and links to other reports.
- Pipeline Report -- The Pipeline Report provides the details about each stage in the pipeline, including hash values for each file used, and the status of each.
- Logs -- Logs are created with progress statements for each stage and for each ballot during extraction. There are two types of logs: normal logs and exception reports. Exception reports are created when a ballot or condition is found to require additional special reporting, and frequently ballot images are saved in conjunction with these reports. For example, if a ballot could not be aligned, this would prompt an exception report.
- Ballot Research Report -- Any individual ballot can be researched to provide the details from each of the stages about that ballot. Depending on the settings, this can also generate intermediate image processing data about each ballot as it is processed.
Q: What data is needed to run audits?
AuditEngine requires only a few data items to be exported by the election system; we do this to minimize our reliance on the voting system. This makes our auditing solution more independent than other options that may require more data. But if we are using the "Cooperative Workflow", we can provide faster turnaround if we get a bit more from the election system, so we can configure our system before the real data becomes available.
- For Ballot Image Audits -- The normal data we need is, at a minimum, the following:
- Ballot image archives (BIAs), combined into ZIP archives, up to about 50K ballots per archive.
- Cast Vote Records Files (CVRs) -- in xlsx format (ES&S) or JSON format (Dominion)
- Preferably, Ballot Style Masters (BSM) as PDFs in searchable format
- preferably, with all timing marks and barcodes.
- or without timing marks or barcodes but with style designation shown on each style.
- To improve turnaround -- particularly if we are working with jurisdictions that want quick turnaround:
- Logic and Accuracy Test (LAT) ballot images archives (LAT-BIAs) and the corresponding LAT CVRs,
- ES&S: with "Ballot Style" field provided in the CVRs.
- Dominion: with JSON CVRs corresponding to the LAT ballots.
- BMD Strings: A list of the contest names and options as shown on BMD ballot summary cards, if different from the official names.
- To Run Verification Images -- independently scanned ballots, aggregated to the same groups as are reported in the CVR.
- Digital Poll Tapes Audit -- For ES&S systems, we can also run a "Digital Poll Tapes Audit", which parses the digital poll tapes that can be exported from the ES&S EMS for each machine used in early voting or on election day, and compares these with the aggregated totals.
Instructions for Exporting Data from the EMS
- Sending Election Data to AuditEngine -- Includes:
- Archiving the Data
- Creating a Hash Manifest File (see the sketch after this list)
- Providing the data:
- Posting the data - Election officials are now opting to post the data once for all requesters.
- Uploading - We can provide a county-specific upload link so the files can be easily uploaded.
- Using USB Thumbdrives
- Using a "Jump Drive"
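A hash manifest file can be as simple as one row per archive with its SHA-256 value, so anyone can confirm the files were not altered after export. The following is a minimal sketch with hypothetical paths, not a prescribed format:

```python
# Build a CSV hash manifest for all ZIP archives in an export directory.
import csv
import hashlib
from pathlib import Path

def make_manifest(data_dir: str, manifest_path: str) -> None:
    with open(manifest_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["filename", "sha256"])
        for path in sorted(Path(data_dir).glob("*.zip")):
            h = hashlib.sha256()
            with path.open("rb") as f:
                # Read in 1 MB chunks so large archives need not fit in memory.
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            writer.writerow([path.name, h.hexdigest()])

make_manifest("ems_export/", "hash_manifest.csv")  # hypothetical paths
```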
Q: Does AuditEngine have a good track record?
AuditEngine is relatively new, largely because the ballot images it uses in its review have only recently become available on a widespread basis. However, we have recently completed a very thorough case study of the platform on three counties in Florida: Collier, Volusia, and St. Lucie. This case study provides evidence of the very high accuracy of AuditEngine: it agrees with the voting system on more than 99.7% of ballots, and when we disagree with the voting system, AuditEngine interpreted voter intent correctly 93% of the time. In other words, we have shown that AuditEngine is more accurate at automatically interpreting voter intent than the voting systems.
In the case study of Volusia County, FL in the 2020 General Election, we identified a number of discrepancies in how results were uploaded to the Election Management System (EMS). As a result, we now know that there are two internal tabulations in ES&S equipment, and these two tabulations may grow to differ due to several error modes.
We identified 4,904 ballot images that were duplicated, due to a failure of the thumbdrive in an early voting precinct, followed by "clearing" the election and starting over; the images were not properly deleted, and neither were the CVR records of those initial ballots. One voting machine was never correctly uploaded using the thumbdrive, resulting in 537 fewer ballot images than should have been provided. Despite these operational errors, the tabulation from the county appeared correct, because it was based on the aggregated totals rather than the CVR and ballot images, which differed but were internally consistent.
These issues were detected not when the ballots were evaluated by AuditEngine, but rather during the metadata analysis phase.
We have also recently audited Bartow County, GA, which uses Dominion Voting Systems equipment and software. Also, during the development of the platform, we performed audits of elections in Dane County, WI, Wakulla County, FL, Leon County FL and San Francisco, CA.
You can read the case study report of the three counties in FL and associated explanation videos on this page: https://copswiki.org/Common/M1970. Audits of two counties in GA and Dane County, WI are also available for review.
With that said, we must admit that AuditEngine is relatively new technology, and the election field is highly non-standardized, with proprietary voting systems and a vast number of different ballot layouts and conventions. Therefore, we do occasionally encounter a new situation that requires additional software development or configuration changes.
Q: How can we trust the result of the audit by AuditEngine?
A: The premise of AuditEngine is complete transparency. We turn a black box into a transparent box.
The AuditEngine auditing system is simple in concept: we read the vote off each and every ballot image and create an independent tabulation. Our system provides complete transparency, so you can take any ballot and follow it through the system. The system will find "disagreements", where the audit system interpreted the marks on the ballot differently from the voting system used by the jurisdiction. We can then manually inspect those ballot images and confirm how those ballots should be interpreted, and if we want, dig into the paper ballots and find those exact ballots. When we disagree with the voting system, AuditEngine correctly interprets the marks about 93% of the time, according to our recent case study in Florida, whereas the voting system correctly interprets the same marks only 7% of the time. Typically, the disagreements are fewer than 0.25%, a quarter of one percent, depending on whether the voting system results were heavily manually adjudicated. AuditEngine tends to find incorrectly interpreted undervotes, where the voter made a mark that was intended for the candidate but was not sufficiently inside the bubble. AuditEngine uses an "adaptive threshold" method, which evaluates each mark relative to the other marks on the same ballot and the relative darkness or lightness of the ballot itself.
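To illustrate the adaptive-threshold idea, here is a sketch that judges each target's darkness relative to the other targets on the same ballot, rather than against one fixed global cutoff. The numbers, names, and the midpoint rule are hypothetical stand-ins, not AuditEngine's actual parameters:

```python
# Per-ballot adaptive thresholding of target darkness scores.
def evaluate_contest(darkness: dict[str, float]) -> dict[str, str]:
    """darkness maps each option to a 0..1 fill score for its target."""
    values = sorted(darkness.values())
    background = values[0]   # lightest target approximates this ballot's baseline
    # Threshold at the midpoint between the lightest and darkest targets,
    # so a faint scanner or paper bias shifts the cutoff with the ballot.
    threshold = background + 0.5 * (values[-1] - background)
    return {opt: ("vote" if d > threshold else "no_vote")
            for opt, d in darkness.items()}

# A faint but deliberate mark still clears the per-ballot threshold:
print(evaluate_contest({"SMITH": 0.45, "JONES": 0.08, "LEE": 0.06}))
```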
Q: How do we know the ballot images have not been altered?
- The proper ballot images from the election department, as exported by the "Election Management System" (EMS), must be uploaded to the secure cloud data center used by AuditEngine. After they are uploaded, the hash values are easily read in the listing of each file, without any further processing. These hash values can be compared with the values produced by similar calculations by election officials to confirm that the image files are the same. The use of these secure hashes is commonplace and a well-respected methodology.
The following references provide an overview of hash functions and their use in the Federal Rules of Evidence:
- "Why Hash Values Are Crucial in Evidence Collection & Digital Forensics" -- https://blog.pagefreezer.com/importance-hash-values-evidence-collection-digital-forensics
- Federal Rules of Evidence FRE 902(13) and (14) -- https://www.foley.com/en/insights/publications/2017/12/new-federal-rules-of-evidence-90213-and-90214
- The second level of this question has to do with whether a hacker has modified the images before they were captured by the election department, perhaps using a virus inside the voting machine itself. This may indeed be a hazard in the future when ballot image audits become commonplace. But in recent elections, no one expected a ballot image audit to be performed, and so if you assume a hacker or compromised insider wanted to modify the election, they would likely just modify the numbers in the election result (in the EMS database) rather than go to all the trouble of modifying the images, which is indeed a lot of work and may be obvious when the images are inspected. So for now, we can largely ignore the possibility that anyone would go to this expense.
- If the ballot image audit finds no inconsistencies, one option is to perform an independent rescan of the ballots using high-speed scanners that are not used in the election process, process those images using AuditEngine, and then compare the result of the tabulation on those batches. This process would detect any image manipulation that would alter the result of any contest.
- Furthermore, we at CitizensOversight are working to include cybersecurity measures that would allow us to detect any modification of ballot images once they are produced. At this time, these measures have not been adopted in the standards nor incorporated by voting machines. We view such hacks at the time the image is created as very unlikely, particularly if the image is scanned using commercial off-the-shelf (COTS) scanners that are not purpose-designed as a voting system.
- Even while knowing that modification of the images is very unlikely, we do advise that some paper ballots also be inspected and compared with the images to provide further confidence. Today, most districts perform a limited audit of the paper ballots. That audit also verifies the ballot images, because the images are used by the tabulators to determine the vote on each ballot. We also suggest that if AuditEngine finds batches that have disagreements in terms of voter intent, the paper ballots can be checked by locating each paper ballot and inspecting and comparing it with the ballot image. Doing this a few times provides a good sense that the ballots are indeed well organized and correspond with the ballot images.
- If a thorough hand count is performed, checking the result of that hand count on a batch-by-batch basis can help to eliminate the possibility that: 1) the hand count was incorrectly performed, 2) the hand count results were modified, and 3) the ballot images were modified (of course, only to the extent the hand count reviewed those ballots). Thus, if a hand count covered only one or two contests, those contests can be compared with the results of the ballot image audit (which covers all contests). In theory, image manipulation could occur in just those contests not hand counted, but modification of down-ballot contests by modifying the images is even less likely.
- What we tend to find quite often is that there are inconsistencies in the raw counts of ballot images, when 1) some ballot images were copied twice into the set, 2) some ballots were rescanned to produce duplicate ballot images, or 3) some ballot images are missing because they were not uploaded to the EMS. (We found these exact problems in the Volusia 2020 General Election, covered in detail by our case study; read more here: https://copswiki.org/Common/M1970.)
Q: Are there aspects of the election that AuditEngine does not include?
A: Yes. AuditEngine provides a consistency check between the ballot images, which are made very early in the tabulation process, and the official results, which come at the very end. Thus, it can detect most issues, such as errors or malicious changes, between these two checkpoints. It does not include many aspects of the election that do deserve scrutiny, such as voter registration, voter eligibility, paper ballot alteration, ballot harvesting, signature validation, campaign finance, inappropriate advertising, etc. The consistency check from ballot images to final result eliminates some of the most obvious security hazards. As we continue to develop AuditEngine, we will also be adding additional components where the horsepower of the cloud is beneficial.
Q: Does AuditEngine ever fail to process ballot images?
A: Yes. We find that some ballot images are distorted and poorly created by the voting system. This is particularly true with some older ES&S equipment.
Q: Do you need ballot masters for each style prior to running AuditEngine for a given election?
A: No, AuditEngine can operate without ballot style masters; it derives style templates from the images themselves, so it is not necessary to have the ballot masters for each style. However, we can generate the target maps much more easily, and with fewer human errors, if we have them. The helper app TargetMapper is then used to map the targets on the ballot to each style, contest, and ballot option. If we can get the Ballot Style Masters, which are PDF files in "searchable" format, we can more easily extract the exact locations of each of the target ovals and the associated text on the ballot. There is still an abbreviated manual process using the TargetMapper app to pair up the text used on the ballots with the text used in the cast vote records.
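As a hedged sketch of what "searchable" buys you: a PDF library such as PyMuPDF can list each word with its coordinates, which helps locate contest and option text near the target ovals. The file name below is hypothetical, and this is only one way to approach it:

```python
# List every word and its position in a searchable ballot style master PDF.
import fitz  # PyMuPDF

doc = fitz.open("ballot_style_master_0001.pdf")
for page_number, page in enumerate(doc, start=1):
    # get_text("words") yields (x0, y0, x1, y1, word, block, line, word_no)
    for x0, y0, x1, y1, word, *_ in page.get_text("words"):
        print(f"page {page_number}: {word!r} at ({x0:.0f}, {y0:.0f})")
```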
Q: How much time do you need in advance of the election to set up AuditEngine?
A: For audits conducted by the public using publicly available information, AuditEngine is typically deployed after the election, when the results and ballot images have been finalized, or at least after semi-final results have been published. However, it is helpful to have some experience with a given area and the specific methods used in any given jurisdiction from prior audits. By getting the Ballot Style Masters in advance, the target mapping phase can be accomplished prior to the election, so the ballot images can be processed quickly once they are available.
When we are working with election districts and quick turnaround is important, it is best if we are provided with ballot images and the CVR from the Logic and Accuracy Test (LAT), along with the Ballot Style Masters, so we can create the mapping prior to the election. Then, when the election results are finalized, the system will be fully configured to accept the live data and produce the results.
Q: Does AuditEngine also audit "Ballot Marking Device" (BMD) ballot summary sheets?
A: Yes. AuditEngine "reads" the printed text rather than the barcodes. BMD ballots are those printed by systems that incorporate touch screens to allow the voter to make selections, followed by printing a voted selection summary card. This card, or sheet, includes linear or 2-D barcodes that provide a machine-readable representation of the selections by the voter. These barcodes are typically difficult if not impossible for voters to verify; voters can only verify their selections in the printed text. Thus, the part verified by the voter is not the part read by voting systems. AuditEngine stands alone in the field of ballot image auditing offerings because we perform OCR on the printed selections to determine the vote on the ballot rather than relying on the barcodes. Because we compare that result with the official result in the Cast Vote Record, this essentially puts a check on the possibility that the barcodes might say one thing while the text says something else.
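The core of that OCR step can be sketched with the Tesseract engine via pytesseract. The image name here is hypothetical, and the matching of OCR lines to the EIF's official BMD strings is left out:

```python
# Read the printed text on a BMD summary card instead of decoding its barcode.
from PIL import Image
import pytesseract

card = Image.open("bmd_summary_card.png").convert("L")  # grayscale helps OCR
text = pytesseract.image_to_string(card)

for line in (ln.strip() for ln in text.splitlines()):
    if line:
        print(line)  # each printed contest/selection line, to be matched to the EIF
```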
Q: What voting systems do you support?
Currently, we support the two leading voting system vendors, Election Systems & Software (ES&S) and Dominion Voting Systems, and we are working to also support Hart InterCivic. We prefer the latest generations of these systems, which provide a ballot-by-ballot cast vote record (CVR) report of the voting system results, so we can compare with the voting system down to the ballot. The older Dominion and ES&S systems do not provide that level of reporting even if they provide ballot images; although we can process the images to produce an overall tabulation, we can't compare on a ballot-by-ballot basis.
Q: How many people are involved in doing an audit?
We need at least one auditor to be in charge of each audit, plus a number of workers who can help with the mapping and adjudication process, to the extent those are required, and any number of observers. The amount of work required is highly dependent on the sheer number of ballots and the smallest margin of victory. If the margin of victory is fairly large, and if we find a relatively small number of disagreements, we may not need to review them all to conclude that the result is consistent. On the other hand, with a very close margin, every disagreement will need to be reviewed. If there are also a large number of write-ins, this can also increase the amount of work involved. At this stage, we are still evaluating how many people are needed in general.
With that said, we encourage the process of each stage of the audit to be witnessed by a set of interested parties in an observers panel, so they can have all their questions answered and the process can also be livestreamed to the public.
Q: Is AuditEngine "open source"?
Although AuditEngine uses a lot of open source software and we endorse standardization, at this time AuditEngine is not fully open source. We are reviewing our options, but at present we believe the most important aspect is providing "open data" transparency, so that anyone can check the data at each stage of the process. Open source software works best when the users of the software modules are programmers who can actively work to improve them. The users of AuditEngine are not programmers, so providing open source software would not help verify the accuracy of the audit result. Plus, since the software runs in the cloud, it is very hard to prove that it has not been changed from the open source that may have been inspected. Our philosophy is that it is more important for the data to be open, so it can be checked at intermediate points along the way.
AuditEngine is designed to operate in a number of discrete stages. Each stage processes some input data and creates output data. Any ballot can be checked in any stage, and any ballot can be checked with a detailed single-ballot report.
AuditEngine has been run now on many millions of ballots, and any edge cases are quickly exposed in the operation of the software itself.
There is another aspect of open source which is perceived as a benefit in most situations: code sharing. Thus, in the open source world, if something has been developed it is commonly reused for many other purposes. Much of the code that comprises AuditEngine is single-purpose software. If it is reused, it will be reused for the same purpose by another entity.
We believe it is more beneficial that this software is not shared; instead, any other entity should develop their own auditing software, which will have different characteristics. Using both auditing systems on the same elections provides the opportunity to compare the results of two (or more) independently designed systems. This is a beneficial competitive process, while sharing the underlying code does not provide this cross-checking.
Q: How is AuditEngine funded?
We are pursuing a grass-roots funding model, where we fundraise for each audit from the general public rather than relying on contracts with the same government entities we are auditing. We believe such contracts, unless carefully constructed, will result in auditors preferring to provide high scores to their clients. We believe that the cost of operating AuditEngine is low enough that the public can fund each audit, given the interest in having an independent review. Please donate today!