Glossary
In the definitions below, lower-case terms are specific to AuditEngine and are not defined elsewhere, whereas Capitalized terms are used by the election integrity field, except for proper names of AuditEngine applications and components. Terms provided here have special or important meaning for AuditEngine.
A general set of election terminology can be found at this URL: https://www.eac.gov/sites/default/files/glossary_files/Glossary_of_Election_Terms_EAC.pdf
Adaptive Thresholding
AuditEngine uses a methodology called "adaptive thresholding", which adjusts the mark recognition thresholds based on the habits of the voter and the darkness of the scanned image. This methodology is applied when marks are recognized on the ballot, and it determines whether there is a mark at each target location. It does not work if there are very few targets on the ballot or very few marks by the voter, and in those cases it is not used. See also Evaluation Heuristics, which are used to evaluate whether the marks resolve into votes and whether the contest is Overvoted or Undervoted. AuditEngine uses sets of larger and smaller evaluation areas to catch most circles and checks that fall outside the Target symbol but express obvious voter intent.
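As a rough illustration only, the idea of adapting a threshold to a single ballot can be sketched as follows. This is not AuditEngine's actual algorithm: the largest-gap rule, the function name `adaptive_threshold`, and every parameter value are assumptions made for this example. It shows a per-ballot threshold derived from the darkness measured at every target, with a fallback to a fixed threshold when there are too few targets or no clearly marked ones.

```python
def adaptive_threshold(darkness, default=0.5, min_targets=6, min_gap=0.15):
    """Pick a mark/no-mark threshold for one ballot (hypothetical sketch).

    darkness: fraction of dark pixels (0.0 to 1.0) measured at each target.
    Returns `default` when the ballot gives too little information.
    """
    if len(darkness) < min_targets:
        return default  # too few targets to learn anything from this ballot

    values = sorted(darkness)
    # Find the widest gap between neighboring darkness values.
    gaps = [(values[i + 1] - values[i], i) for i in range(len(values) - 1)]
    widest, index = max(gaps)
    if widest < min_gap:
        return default  # no clear separation between marked and unmarked

    # Place the threshold in the middle of the widest gap.
    return (values[index] + values[index + 1]) / 2


def marked_targets(darkness, **kwargs):
    """Return a True/False decision per target using the adapted threshold."""
    threshold = adaptive_threshold(darkness, **kwargs)
    return [value > threshold for value in darkness]
```

With a light-handed voter whose marks measure only 0.40 and 0.45, a fixed threshold of 0.5 would miss both marks, while the adapted threshold falls between the empty targets and the marks.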
Adjudicated / Modified Record
For some voting systems, such as Dominion when the JSON style CVR is used, the CVR may provide not only the original evaluation of the ballot sheet, but also a 'modified' record which includes all the contests on that sheet. The 'modified' record may change zero or many contests on that ballot sheet after human-eye review by election staff. If the modified record does not show a change in a given contest, it is not possible to know from that record whether the contest was inspected and confirmed or not inspected at all.
Adjudication
This is the process of human review of ballots or ballot images and providing an interpretation of the vote on that ballot. Dominion has an adjudication module that facilitates adjudication by election staff. AuditEngine also has an AdjudiTally App that can help fine tune the results obtained by AuditEngine.
AdjudiTally App
This component of AuditEngine is a browser-based application that provides a user interface to check any ballot, particularly flagged contests or comparison results that are disagreed. The AdjudiTally App has a number of modes that operate from the same interface. It can be used to:
- select between the evaluation by AuditEngine or the evaluation by the voting system (per the CVR) by selecting the evaluation which is correct or entering it from scratch.
- fine-tune the results by AuditEngine without any voting system results by reviewing any records that have been flagged for further review.
- review ballot images without any results by AuditEngine or the voting system.
- view ballot images and tally the results either by a single person or by a crowd-sourced team.
Amazon Web Services (AWS)
This is a cloud-based service provider that offers a range of services, including storage of digital data (S3) and compute services (Lambda), among others that are used by AuditEngine. In particular, AuditEngine uses up to 10,000 Lambda compute instances in parallel to quickly process hundreds of thousands or millions of ballot images.
Archive / Zip Archive
A single file which can contain many smaller files and may also compress them. This makes it easier to handle a lot of separate small files and also usually saves a lot of disk space, because each file consumes at least one block and the last block of a file is usually partially empty. Archives pack all the files together into one file without any empty space.
Most important for this application is that archives make it easier to handle many thousands of individual files by grouping them together into a small number of archives. We require that you use the open, standardized "ZIP" archive format, because files can be individually extracted from the archive without extracting them all. The free application "7-zip" from 7-zip.org is a very good tool, but use the conventional zip format and not the 7z format. We recommend placing up to about 50,000 ballot images in each archive, and each archive should be between about 5 GB and 10 GB in size for easy handling.
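The property that makes ZIP suitable here, reading one member without extracting the rest, can be seen with Python's standard `zipfile` module. This is a minimal sketch, not AuditEngine code; the member paths and the helper names are hypothetical, and a tiny in-memory archive stands in for a real ballot image archive.

```python
import io
import zipfile


def list_image_names(archive):
    """Return member names by reading only the archive's central directory."""
    with zipfile.ZipFile(archive) as zf:
        return zf.namelist()


def read_one_image(archive, member_name):
    """Read a single member; the other members are not extracted."""
    with zipfile.ZipFile(archive) as zf:
        return zf.read(member_name)


def build_demo_archive():
    """Create a tiny in-memory ZIP standing in for a ballot image archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("batch_001/00001.tif", b"first-image-bytes")
        zf.writestr("batch_001/00002.tif", b"second-image-bytes")
    buffer.seek(0)
    return buffer


demo = build_demo_archive()
names = list_image_names(demo)
second = read_one_image(demo, "batch_001/00002.tif")
```

`archive` may be a path on disk or any seekable binary file object, so the same helpers work for a multi-gigabyte archive.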
AuditEngine App
AuditEngine provides a user-friendly application that runs in the user's browser that assists with the aspects of running a job. To use the App, you must have an account and be authorized for the given activities. There are a number of menus that can be summarized as follows:
- Districts -- A district is normally either a County or an organization that wishes to use AuditEngine to process its organizational election. Each district has contacts and a location. Each district can have a common uploading area and a fixed link used for uploading.
- Elections -- Within a district, elections can be defined. An election has a name and a date, and a related District. An election also has a number of files that must be uploaded from or by the District, including Ballot Image Archives, CVRs, Ballot Style Masters (BSMs), etc. Uploads from a given district first go to an upload folder before they are "adopted" and combined with a given election. AuditEngine has a convenient uploading function which allows the user to request the uploading of any number of files and then come back later, as long as the browser window remains active.
- Audit Jobs -- For a given District and Election, there can be one or more Audit Jobs, and these will use the data uploaded to the Election. A given Audit Job has a job_name and consists of a number of Phases and Stages in a Pipeline. The Audit Job also has a Job Settings file to configure the job.
- Users -- There is also the concept of Users, who can be related to a given District or Election, and each has permissions that can be adjusted to allow users to observe or assist in the audit.
Audit Job
For a given Election in a given District, there can be one or more Audit Jobs. An audit job tracks the specific stages completed in the processing pipeline. Each Audit Job for a given election will have a unique job_name.
See also District, Election, Job Name
Audit Phase
A logical major step in the pipeline, which consists of a number of stages that can be grouped into a logical concept. To conduct Ballot Image Audits (BIAs), AuditEngine uses at least 4 and sometimes 8 phases. A given workflow may use more than one audit for a given election, starting with an audit of the Logic and Accuracy Test (LAT) data and using that to set up AuditEngine to be ready for the live election data when using a Cooperative Workflow.
The basic phases are:
- Setup and Upload Live Election Data, Perform Consistency Checks
- Create Style Templates and Map the Styles using live election data.
- Vote Extraction -- recognize the votes on all ballots, including BMD ballots, primarily with the stage extractvote
- Comparison and Reporting -- See cmpcvr
- (optional) Scan Verification Batches and perform extraction and comparison.
If Cooperative Workflow is used, then there are 2 to 4 preliminary phases using Logic and Accuracy Test (LAT) data.
- Setup and Upload LAT Election Data, Perform Consistency Checks
- Create Style Templates and Map the Styles using LAT data.
- (optional) LAT Vote Extraction -- recognize the votes on all ballots, including BMD ballots.
- (optional) LAT Comparison and Reporting
- Setup and Upload Live Election Data, Perform Consistency Checks
- (not used -- already done) Create Style Templates and Map the Styles using Live Data.
- Vote Extraction -- recognize the votes on all ballots, including BMD ballots.
- Comparison and Reporting
- (optional) Scan Verification Batches and perform extraction and comparison.
Ballot
Generally a presentation of contests and options for each contest that can be selected by a voter, and the recording of those selections. The word "ballot" originally meant "little ball" and these were placed into a box to represent the votes of each person. In practice, a complete ballot may be a number of sheets, each with a front and back side, also called pages. However, in many instances, the term "ballot" may also be used to refer to a single sheet. Thus, "Ballot Images" should technically be "Ballot Sheet Images" because when there are multiple sheets, each image is only one sheet of the combined ballot. Sometimes, a ballot image is only of one side of one sheet (one page).
See Ballot Image.
ballot_id
A unique designator used by the voting system to refer to a specific ballot. It is either an integer or a compound value with parts. ES&S uses integers from 1 on up. Dominion uses a three-part number separated with underscores, like 02438_00043_293456, where the first part is the tabulator number, the second part is the batch processed by that tabulator (starting at 1), and the final part is the RecordId, which sometimes starts at 1 and increments and other times is pseudo-random. This term is defined by AuditEngine. There may be gaps in these numbers, and they are likely also assigned randomly, especially when ballots are cast in person.
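The two shapes of ballot_id described above can be handled with a small parser. This is a sketch under assumptions: the names `DominionBallotId` and `parse_ballot_id` are hypothetical, and it treats any value without underscores as a plain integer id of the ES&S kind.

```python
from typing import NamedTuple, Union


class DominionBallotId(NamedTuple):
    """Three-part id: tabulator number, batch number, RecordId."""
    tabulator: int
    batch: int
    record_id: int


def parse_ballot_id(ballot_id: str) -> Union[int, DominionBallotId]:
    """Parse a ballot_id string into an int or a three-part tuple."""
    parts = ballot_id.strip().split("_")
    if len(parts) == 1:
        return int(parts[0])  # plain integer id
    if len(parts) == 3:
        tabulator, batch, record_id = (int(part) for part in parts)
        return DominionBallotId(tabulator, batch, record_id)
    raise ValueError(f"unrecognized ballot_id format: {ballot_id!r}")


example = parse_ballot_id("02438_00043_293456")
```

Because ids may have gaps and may be assigned pseudo-randomly, the parsed values are suitable as keys for lookups but should not be used to infer how many ballots exist.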
Ballot Anonymity
Privacy of the vote is an important goal of democratic election systems. This can best be defined as ballot anonymity, such that no ballot can be paired with its voter. Court cases that have examined this issue have concluded that it may be impossible to obtain absolute anonymity of all votes, because in a handful of cases ballots can be cross-referenced with other data. For example, if only one voter votes in a specific precinct, other data regarding who voted in that precinct can reveal who that voter is. But the ballot image data and the cast vote records, on their own, cannot be connected to any voter. Even if a name is written on the ballot, we cannot know whether the voter wrote their own name or someone else wrote that name on a different ballot.
Because there are these edge cases where the linkage between the voter and their ballot could be established if additional information were available, AuditEngine specifically does not process the List of Registered Voters nor the List of Voters Who Voted, within the operation of a ballot image audit, except to compare the aggregate numbers. The total number of voters that voted should be less than or equal to the number of registered voters and should be equal to the number of ballot images and the number of Cast Vote Records.
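The aggregate comparison described above involves only four totals and no voter-level data. A minimal sketch follows; the function name and the wording of the messages are hypothetical.

```python
def check_aggregate_counts(registered, voted, ballot_images, cvr_records):
    """Return a list of problems found when comparing aggregate totals only."""
    problems = []
    if voted > registered:
        problems.append(
            f"voters who voted ({voted}) exceeds registered voters ({registered})"
        )
    if voted != ballot_images:
        problems.append(
            f"voters who voted ({voted}) differs from ballot images ({ballot_images})"
        )
    if voted != cvr_records:
        problems.append(
            f"voters who voted ({voted}) differs from CVR records ({cvr_records})"
        )
    return problems


consistent = check_aggregate_counts(1000, 600, 600, 600)
```

An empty list means the totals are consistent with each other. Any entry is a prompt for investigation rather than proof of an error.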
Ballot Image
Election systems today create digital pictures of both sides of each ballot sheet. These images can be exported by the Election Management System (EMS) in standard formats, such as PDF (Adobe Portable Document Format, used by ES&S generally with both the front and back in the same file), multipage TIFF (Tagged Image File Format; Dominion uses these, with three pages in one file, the front, back and AuditMark(tm) graphical image of the voting system evaluation), or PNG (Portable Network Graphics; Dominion uses this format, typically with all three pages combined into a single tall image.) Clear Ballot uses JPEG format files, and Hart uses TIFF format.
Please note that the term "Ballot Image" has a deprecated meaning. Previously, it was used to mean the digital inputs made by the user at a Direct-Recording Electronic (DRE) machine or another touch-screen machine, because the term "image" is sometimes used in computer science to mean an exact copy of digital data. This prior use is now deprecated, and the term is now widely understood to mean the actual digital pictures of the document rather than a copy of digital data that does not represent a picture.
Ballot Image Audit (BIA)
A review of digital images of ballots in a jurisdiction of an election and comparison with the official outcome for each ballot, group of ballots (precincts or batches), or the entire jurisdiction. AuditEngine is a platform that can provide the computerized and human-interface processing for such an audit.
Ballot Marking Device (BMD)
Typically a touch-screen interface with a printer which allows the voter to select the desired option for each contest, and then print out a ballot summary card with those options printed on it, typically also with barcodes or QR Codes that provide the votes of the voter. This print-out is later scanned to produce the cast vote record for that ballot. Some BMDs do not print a ballot summary card, but instead print a ballot in the same layout as a hand-marked paper ballot, which cannot be easily distinguished from one.
Election officials may choose to use BMDs to reduce ambiguity of what is voted, including write-ins, because those are keyed in. However, BMDs provide lower verifiability. Voters tend not to check the printed ballot summary card for accuracy, and the QR Code could potentially be different from what is printed. Also, there is no way to verify, by looking at the ballot itself, that all the contests and options were presented to the voter in their private voting session. In contrast, a hand-marked paper ballot provides all contests and options in printed form and can be verified. Since BMDs do not have any internal memory and produce a durable paper record, they are an improvement over Direct Recording Electronic (DRE) devices.
ES&S and Dominion use barcodes and QR Codes, respectively, which unfortunately cannot be easily verified by the voter. Therefore, AuditEngine does not use the barcodes or QR Codes and instead uses OCR (optical character recognition) to read the selections from the human-readable summary. This will detect cases where the barcode or QR Code does not express the selections of the voter.
Ballot Style
A designation that represents the set of contests on the ballot, the language and, in the case of a hand-marked paper ballot, the order of voting targets. The contests on the ballot are determined by the voter's address and party affiliation, in the case of partisan elections where ballots differ by party. There may be hundreds or thousands of styles in any given election in a single jurisdiction. The method for designating style is one of the most complex aspects of a ballot image audit, because there are many variations that are up to election districts.
To deal with these many variations, AuditEngine has a number of additional terms:
card_code - the actual encoding on the ballot determined by the Election Management System (EMS) and usually unknown to election workers.
style_num - an integer that represents a given style, and may be either exactly the card_code or a deterministic conversion of that code.
hexstyle - a useful representation that avoids arbitrary numerical assignments of the style_num, and represents only the set of contests used on the ballot. See Hexstyle.
pstyle_num - a printed style designation, usually a more human-friendly representation than style_num but with (usually) a 1:1 correspondence with the style_num. The pstyle_num can be extracted from the ballot using Optical Character Recognition (OCR) or from a barcode printed on the ballot. OCR is not perfect, and a misread style designation can cause critical errors.
Ballot Style Masters (BSMs)
PDF files in "searchable" format that are normally sent to the ballot printing contractor for printing of ballots in their final usable form. These files are used by the TargetMapper App component of AuditEngine to simplify the process of Mapping the ballot styles. Mapping provides the location of each active target oval or rectangle which is darkened by the voter to express their voting intent and associates it with the contest and option as specified in the cast vote record (CVR). Preferably, these masters should also include the timing marks and any barcodes to allow the extraction of the encoded style information. However, TargetMapper can operate without these if there is a legible style indication on the ballot that is unique for each style.
Ballot Summary Card
BMDs use a touch-screen interface and usually produce a Ballot Summary Card which includes barcodes or QR Codes that encode the vote so it can be quickly read by the EMS after scanning. Ballot Summary Cards also provide human-readable text under the barcodes so the voter can verify their vote. But most do not provide an easy mechanism for voters to verify that the barcodes correctly encode the vote.
Ballot Variant
A term used in the comparison process. If a complete sheet is blank or if the image was detected as corrupted, then this is classified as a ballot-variant. See also Contest Variant and cmpcvr.
Ballots Cast (Number of)
A ballot is "cast" when the ballot is inserted into the scanner or ballot box when voting in person, and when it is received and accepted as valid for absentee or mail voting.
Determining the number of ballots cast by looking at the ballot images can be difficult if there are multiple sheets. At times, voters do not return all sheets. A close approximation can be obtained by counting first sheets only. However, if anyone does not return the first sheet and returns only the second sheet, then it may be impossible to correctly calculate the number of ballots cast based on the ballot images provided. It is important to compare the official count of Ballots Cast to the List of Voters Who Voted, combined with the count of protected voters that are not on that list.
Ballot Indexing File (BIF)
This term is used only in AuditEngine to refer to indexing and metadata files that are generated to fully index the ballot images, cast vote records, and metadata derived from those sources, as well as from a preliminary examination of the ballot images to extract data in barcodes and in specific areas. Such an index can be compared to the index card system used in libraries to locate a given printed resource, which also provides some metadata about the resource, such as whether it is a book or video, the number of pages, the author, etc.
There are the following forms:
- bia_bif: ballot image archive ballot indexing file - this includes the ballot_id, where the ballot image can be found for each sheet (which archive and the path name), and any other metadata that can be derived by reviewing the list of ballots in ballot image archive ZIP files. This includes whether the ballot image was found to be repeated.
- cvr_bif: location of the CVR record within the CVR files, and metadata available from the CVR record for each ballot sheet.
- biacvr_bif: The bia_bif and cvr_bif records joined into a single set of indexing records, organized in the same order as the ballot image archives.
- blt_bif: These records hold the data derived by examining the ballot images without extracting the votes. They include card_code, pcounty (printed county name, if available), and possibly pstyle (printed style designation) and pprecinct (printed precinct designation). This set of records is not always saved separately, but may simply be combined with the biacvr_bif to create the full_bif.
- full_bif: This is the full set of indexing records, which may be the same as the biacvr_bif, if the data from the blt_bif is not included, or it may include the data from the full preliminary ballot image examination (blt_bif). From this set of records, the stage create_bif_report can be run to provide a full report of the metadata derived across the data available.
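The join that produces the biacvr_bif can be pictured as a lookup of each image record's ballot_id in the CVR index, keeping the order of the ballot image archives. The sketch below is illustrative only: apart from ballot_id, every field name (`archive`, `path`, `cvr_file`, `has_cvr`) is a hypothetical stand-in, and the real files may hold many more columns.

```python
def join_bifs(bia_bif, cvr_bif):
    """Join image index records with CVR index records on ballot_id.

    Output order follows bia_bif, i.e. the order of the ballot image archives.
    """
    cvr_by_id = {record["ballot_id"]: record for record in cvr_bif}
    joined = []
    for image_record in bia_bif:
        cvr_record = cvr_by_id.get(image_record["ballot_id"], {})
        row = {**cvr_record, **image_record}
        row["has_cvr"] = bool(cvr_record)  # flag images with no CVR record
        joined.append(row)
    return joined


bia_bif = [
    {"ballot_id": "3", "archive": "batch_a.zip", "path": "a/3.pdf"},
    {"ballot_id": "1", "archive": "batch_a.zip", "path": "a/1.pdf"},
]
cvr_bif = [{"ballot_id": "1", "cvr_file": "cvr_01.xlsx"}]
biacvr_bif = join_bifs(bia_bif, cvr_bif)
```

Images without a matching CVR record remain in the result with `has_cvr` set to False, so that they surface in later reporting instead of being silently dropped.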
Barcode
A machine-readable graphical artifact, typically linear in form, commonly printed on BMD ballots. ES&S uses "Code-128" barcodes which can represent all 128 ASCII code characters (numbers, upper case/lower case letters, symbols and control codes). Code-128 is a linear barcode with vertical bars where the width of the bars and spaces determines what the bars encode. Three bars and three spaces encode each character. There is also a check digit that can detect if the barcode is corrupted.
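The corruption check in Code-128 is a weighted sum taken modulo 103. The sketch below computes it for data encoded entirely in Code Set B, where each printable ASCII character has the value `ord(char) - 32` and the start symbol has the value 104. Which code set a given ballot uses is not stated in this glossary, so Code Set B is an assumption made for the example, and the function name is hypothetical.

```python
START_B = 104  # value of the Code Set B start symbol
MODULUS = 103  # Code-128 check values are computed modulo 103


def code128b_check_value(data: str) -> int:
    """Return the Code-128 check symbol value for data in Code Set B."""
    total = START_B
    for position, char in enumerate(data, start=1):
        code_point = ord(char)
        if not 32 <= code_point <= 126:
            raise ValueError(f"not encodable in this sketch: {char!r}")
        # Each symbol value is weighted by its 1-based position.
        total += position * (code_point - 32)
    return total % MODULUS


def is_consistent(data: str, check_value: int) -> bool:
    """True when a decoded check value matches the decoded data."""
    return code128b_check_value(data) == check_value
```

A reader that decodes the bars recomputes this value and rejects the scan if it does not match the check symbol printed in the barcode.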
AuditEngine does not rely on the barcodes to determine the votes on a BMD ballot summary card, but rather reads the printed strings using Optical Character Recognition (OCR). Sometimes, the term barcode is used as a generic term to mean any type of linear or 2-d (2-dimensional) code, such as a QR Code, or a Datamatrix code which is similar to a QR Code. 2-d codes have superior error detection and correction.
Both ES&S and Dominion use proprietary barcodes to encode the style on hand-marked paper ballots. See card_code.
card_code
This term is defined by AuditEngine to refer to an integer or binary code encoded on a ballot sheet, sometimes on each side of the sheet, using barcodes or other encoding, which can be used to determine the Ballot Style. The card_code might be directly used as the style_num, or there may be a conversion utilized. The card_code is generally not exposed to election staff that use the EMS to design the election, and is assigned to ballots by the EMS usually without the knowledge of the EMS user. These numbers are specific to a given county and different counties may use (unfortunately) the same card_codes to mean different styles.
Cast Vote Record (CVR)
A data file or set of files that provides the outcome of an election, typically broken down to the individual ballot. For ES&S, the CVR is typically a set of Excel spreadsheet files (.xlsx), where each record is an individual ballot sheet or one BMD ballot. Dominion, in its more recent systems, uses a variant of the NIST CVR "Common Data Format", delivered as a set of JSON files or sometimes CSV (comma separated values) files.
ES&S also uses this term for PDF files that may be provided where each is a written summary of the voting system evaluation of the vote on that ballot, and so we call these "CVR PDF files" while the spreadsheets are "CVR Spreadsheets". Dominion uses a similar record called the "AuditMark(tm)" which is combined with the ballot images as the third page of the TIFF image file.
There are several variants of CVR that are commonly seen:
- Summary, as Independent text or PDF, or image file: Sometimes, there is an independent text, PDF, or image file (.png or .tif) that provides a summary of the votes on that ballot. Typically, this summary includes only the options voted for, and whether votes were not cast for a contest, but it does not include options not voted for.
- "Flat" files: ES&S provides an Excel spreadsheet file (.xlsx format) which provides a separate line for each sheet. Dominion also has a similar format (.csv format) with one line for each sheet. The ES&S format provides a separate column for each voting opportunity. For example, if the contest is a "vote for 3" contest, then there will be three columns. In each column, there will be either a candidate name, "undervote", "overvote", "write-in" or an image of the write-in area. The Dominion format provides a separate column for each option, with 0 or 1 in each cell. The ES&S spreadsheet files are limited to 99,999 records each, and so there may be multiple files in the set.
- "Nested" files: typically JSON or XML format. Used by Dominion, Hart and Clear Ballot, but each has its own variations on the theme. Generally there will be one block which represents the sheet, which then contains blocks representing each contest, and then blocks for each option voted for. It may have other information as well. This type of CVR may be a set of files that describe the contests, candidates, and other attributes. In some cases, each ballot image will have an associated nested file, and in others, they combine into fewer files, but perhaps still many thousands of chunks.
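To make the difference between the two "flat" layouts concrete, the sketch below reduces one row of each kind to the same structure, a mapping from contest name to the list of selections. The column names and mapping arguments are hypothetical; real CVR headers differ by vendor and election, and this is not AuditEngine's parser.

```python
def selections_from_ess_row(row, contest_columns):
    """ES&S style: one column per voting opportunity, cell holds a name.

    contest_columns maps contest name -> list of column names.
    """
    selections = {}
    for contest, columns in contest_columns.items():
        cells = [str(row.get(column, "")).strip() for column in columns]
        # An unused voting opportunity is reported as "undervote".
        selections[contest] = [
            cell for cell in cells if cell and cell.lower() != "undervote"
        ]
    return selections


def selections_from_dominion_row(row, option_columns):
    """Dominion style: one column per option, cell holds 0 or 1.

    option_columns maps contest name -> {option name: column name}.
    """
    selections = {}
    for contest, options in option_columns.items():
        selections[contest] = [
            option
            for option, column in options.items()
            if str(row.get(column, "0")).strip() == "1"
        ]
    return selections


ess_row = {"Council 1": "Ann Lee", "Council 2": "Bo Diaz", "Council 3": "undervote"}
ess_votes = selections_from_ess_row(
    ess_row, {"City Council": ["Council 1", "Council 2", "Council 3"]}
)

dominion_row = {"cc_ann": "1", "cc_bo": "1", "cc_cy": "0"}
dominion_votes = selections_from_dominion_row(
    dominion_row, {"City Council": {"Ann Lee": "cc_ann", "Bo Diaz": "cc_bo", "Cy Orr": "cc_cy"}}
)
```

Both rows describe the same "vote for 3" ballot-contest with two selections and one unused voting opportunity, and both reduce to the same result.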
Certification (Election)
Refers to the official declaration by election officials that the election results are final and accurate. In some states, election audits are conducted prior to certification to allow any errors to be corrected. In other states, audits occur after certification, and there may or may not be any official mechanism to change the outcome. When using Cooperative Workflow, AuditEngine can most effectively conduct audits prior to the certification deadline because delays can be minimized. Once certified, candidates or campaigns typically have a short window to request a recount or file a judicial contest. The exact timelines vary by state.
cmpcvr
The name used by AuditEngine for the stage of a ballot image audit where the CVR is compared sheet-by-sheet and contest-by-contest with the tabulation created by AuditEngine. This stage creates a number of comparison records, where there is one record per ballot sheet (either Fully Agreed or Partially Agreed), and one record per ballot-contest that is classified as a contest variant. Ballot-contests that are not considered variants do not have their own record but are included in their parent sheet record. The comparison records are then used to create the Discrepancy Report, which is the primary output of AuditEngine.
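The sheet-level comparison can be sketched as follows. This is a simplified illustration, not the cmpcvr implementation: the function name and record layout are hypothetical, and each evaluation is assumed to be a mapping from contest name to the list of options voted.

```python
def compare_sheet(audit_votes, cvr_votes):
    """Compare two evaluations of one ballot sheet, contest by contest."""
    contests = sorted(set(audit_votes) | set(cvr_votes))
    disagreed = [
        contest
        for contest in contests
        # Order of selections within a contest does not matter.
        if sorted(audit_votes.get(contest, [])) != sorted(cvr_votes.get(contest, []))
    ]
    return {
        "status": "Fully Agreed" if not disagreed else "Partially Agreed",
        "contests_compared": len(contests),
        "disagreed_contests": disagreed,
    }


audit = {"Mayor, Town of Albion": ["Ann Lee"], "Amendment 1: Tax Increase": ["Yes"]}
cvr = {"Mayor, Town of Albion": ["Ann Lee"], "Amendment 1: Tax Increase": []}
sheet_record = compare_sheet(audit, cvr)
```

Here the voting system recorded no vote for the amendment while the audit found a "Yes" mark, so the sheet record is Partially Agreed and that ballot-contest would get its own comparison record.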
Contest or Ballot-Contest
This is a term used in the comparison process. On each ballot cast, there are a number of contests. A single contest on a single cast ballot is a "ballot-contest" and sometimes just "contest" in the context of the comparison report. This is not the entire contest with all votes from all ballots summed, but simply the evaluation of that single contest on that sheet (for that single voter). Frequently, these can be called "votes".
Contest Name
A contest will have a name, which is used in the cast-vote record (CVR), and is preferably unique. The contest name in the CVR may differ somewhat from what is actually printed on the ballot. But generally, the contest name on the ballot will be exactly the same on all styles, and it will have the same options (although they might be rotated and in a different order). There may also be write-in lines associated with the contest.
AuditEngine uses the names of contests as defined by the Cast Vote Record (CVR) file when one is available. The Election Information File (EIF) provides these names as used by the CVR and we stick with those when the CVR is available.
When AuditEngine is used in a state-wide application, counties should cooperate by using the same contest names for statewide contests and for contests in districts that span county boundaries, and should avoid using the same names for contests that are specific to one county.
Contest names should differ within the first 50 characters, and we prefer to have contest names differ in more than one character. For example: "Amendment I" and "Amendment II" are hard to tell apart, compared with "Amendment 1" and "Amendment 2" but it would be better to use something like "Amendment 1: Tax Increase" and "Amendment 2: Term Limits" so they differ by more than one character. Avoid long names like: U.S. Representative in 117th Congress From the 11th Congressional District of Georgia (Vote for One) (NP) and consider using a shorter representation like: U.S. House GA CD-11.
Also please avoid generic names that depend on the ballot style to have meaning. Instead of "Mayor" always add the town, like "Mayor, Town of Albion". True also for positions like Treasurer, Supervisor, Clerk, etc.
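The two naming guidelines above (names should differ within the first 50 characters, and by more than one character) lend themselves to an automated check. The sketch below is one way to do it and is not part of AuditEngine; the function names and reason strings are hypothetical.

```python
from itertools import combinations

SAME_PREFIX = "identical in the first 50 characters"
ONE_CHARACTER = "differ by at most one character"


def within_one_edit(a, b):
    """True if a and b are equal or differ by one inserted, deleted, or changed character."""
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) string
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]  # at most one substitution
    return a[i:] == b[i + 1:]  # exactly one extra character in b


def find_name_problems(names, prefix_length=50):
    """Return (name_a, name_b, reason) for each pair that is hard to tell apart."""
    problems = []
    for a, b in combinations(names, 2):
        if a != b and a[:prefix_length] == b[:prefix_length]:
            problems.append((a, b, SAME_PREFIX))
        elif within_one_edit(a, b):
            problems.append((a, b, ONE_CHARACTER))
    return problems


contest_names = [
    "Amendment I",
    "Amendment II",
    "Amendment 1: Tax Increase",
    "Amendment 2: Term Limits",
]
name_problems = find_name_problems(contest_names)
```

Only the "Amendment I" and "Amendment II" pair is reported, which matches the guidance above that the longer descriptive names are easier to tell apart.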
Contest Options
The voting opportunities in a contest are called Contest Options. The options can either be candidate names or Yes/No options for ballot measure type contests or approval contests (such as for judges). The options may include Write-Ins. The Contest Options are defined by the Election Information File (EIF) and are preferably the same as the strings used in the CVR. Each option on a Hand-Marked Paper Ballot will have a Target, such as an oval, which can be darkened using a pen by the voter.
Some states require that the order of the Contest Options will rotate to avoid bias. The Yes/No options are always in the same order.
Contest Rendition
A given graphical expression of a contest on a Hand-Marked Paper Ballot is called a Contest Rendition. Normally, contests are designed as a block with the Contest Name at the top, a possible description, followed by the Contest Options. Normally, such a contest rendition will be the same no matter where it is shown on the ballot, unless a different language or option rotation is used. The TargetMapper App will allow the user to link a given Contest Rendition to a Contest Name and Contest Options as defined by the CVR, and when linked, it will be found no matter where it is located on the ballot.
Contest Variant
A contest variant is a ballot-contest which has write-ins, overvotes, is flagged, or is disagreed, i.e. if there is any disagreement between the evaluation of the vote by AuditEngine and the official result. (Undervotes are not included in the set unless they are disagreed.) Contest Variants can normally be summed by contest. Each contest variant has a separate comparison record for each ballot-contest instance.
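The classification rule above, and the summing of variants by contest, can be written in a few lines. This is a sketch with a hypothetical record layout (boolean fields `has_write_in`, `overvoted`, `flagged`, `disagreed`, and `undervoted`), not AuditEngine's internal representation.

```python
from collections import Counter


def is_contest_variant(ballot_contest):
    """A ballot-contest is a variant if any of the four conditions holds.

    An undervote alone does not qualify; it counts only when disagreed.
    """
    return (
        ballot_contest["has_write_in"]
        or ballot_contest["overvoted"]
        or ballot_contest["flagged"]
        or ballot_contest["disagreed"]
    )


def variants_by_contest(ballot_contests):
    """Count contest variants per contest name."""
    return Counter(
        record["contest"] for record in ballot_contests if is_contest_variant(record)
    )


def make_record(contest, **flags):
    """Build a record with all flags False unless given (helper for the example)."""
    record = {"contest": contest, "has_write_in": False, "overvoted": False,
              "flagged": False, "disagreed": False, "undervoted": False}
    record.update(flags)
    return record


records = [
    make_record("Mayor, Town of Albion"),
    make_record("Mayor, Town of Albion", has_write_in=True),
    make_record("Mayor, Town of Albion", undervoted=True),
    make_record("Amendment 1: Tax Increase", undervoted=True, disagreed=True),
]
variant_counts = variants_by_contest(records)
```

The agreed undervote in the third record is not counted, while the disagreed undervote in the fourth record is.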
COTS -- Commercial Off the Shelf
This term is used for equipment or software that is not specifically designed for election data processing and is more commonly used for other purposes. As a result, there is a notion that these systems are less likely to have back-doors that could be used to maliciously attack the results of the election. For AuditEngine, we commonly specify COTS scanners for the Verification Phase.
Cooperative Workflow
When working with election districts under contract, turnaround delay can be minimized using a cooperative workflow. The main features of this workflow are that
a) data is available BEFORE election certification, and
b) preliminary election design data (which does not include any live ballots) is provided to AuditEngine prior to election day.
This preliminary data is used to get a jump on the configuration required so no configuration changes are needed to process the live election data.
The components of the Cooperative Workflow include the following:
- We request that election districts take care with Ballot layouts. They should:
- use the same Contest Names and Option Names for contests that are common to multiple counties and different contest names for contests that are unique to any given county.
- include the county and election name in a standard location to allow ballot images from other counties or from other elections to be detected.
- NOT use colors that will drop out in the scans.
- Ballot Image Files should be exported without "COPY" or other watermarks. See the Exporting Guide for instructions on how to export each file.
- Each county will prepare a Hash Manifest File with hash values of each file. We suggest the use of the QuickHash Windows app.
- All data must be directly uploaded to AuditEngine from each county. We can provide an upload link specific to each county which will remain constant from election to election.
- Ballot Style Masters (BSMs) are uploaded as soon as they become available. This should be very early in the cycle.
- Data from the Logic and Accuracy Test (LAT), in the form of ballot image archives (BIA) and the LAT CVR, is uploaded to AuditEngine as soon as it becomes available, and before election day, to allow for configuration of AuditEngine for each county.
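For counties that prefer a script to a desktop tool, the hash values for a manifest can be computed with Python's standard `hashlib`. This is a sketch under assumptions: the glossary does not say which hash algorithm or manifest layout AuditEngine expects, so SHA-256 and the "hash, two spaces, file name" line format are choices made for this example only.

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size=1024 * 1024):
    """Hash a binary stream in chunks so large archives fit in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def sha256_of_file(path):
    """Hash a file on disk, e.g. a ballot image archive or a CVR file."""
    with open(path, "rb") as handle:
        return sha256_of_stream(handle)


def manifest_lines(named_streams):
    """Yield one 'hash  name' line (hypothetical layout) per (name, stream) pair."""
    for name, stream in named_streams:
        yield f"{sha256_of_stream(stream)}  {name}"


# In-memory stand-in for a real file, to keep the example self-contained.
demo_manifest = list(manifest_lines([("ballots_001.zip", io.BytesIO(b"abc"))]))
```

Recomputing the hashes after upload and comparing them with the manifest shows whether any file changed in transit.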
AuditEngine can then be configured, including the completion of the Target Map prior to the election using the LAT data. As soon as ballot images and the CVR are completed in the real election, these files are uploaded. The mapping phase is skipped in the real election audit and the Target Map is imported from the LAT audit. This results in a quick-turnaround of the audit results.
This is in contrast with the Public Oversight Workflow which occurs after Certification with no additional work by election districts other than providing data, and therefore is not turnaround-time optimized.
See also "Workflow"
CSV File
A CSV (Comma-Separated Values) file is the most widely used format to express tabular data, and is frequently used by AuditEngine as the output of and input to various Stages in the processing Pipeline. AuditEngine uses the most standard format, which places a comma between data fields and encloses a field in double quotes if it contains embedded commas. These files can be read by most spreadsheet programs. A CSV file may also include JSON embedded in a given field if the field contains either a list or dictionary.
All metadata files and votes from AuditEngine are resolved and flattened to CSV File formats.
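As a minimal sketch of reading such a file, the Python below uses only the standard library. The column names (ballot_id, contest, marks) are hypothetical and chosen for illustration; they are not AuditEngine's actual schema.

```python
import csv
import io
import json

def read_csv_with_json(text, json_columns):
    """Parse comma-separated text and decode the named columns from JSON."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        for col in json_columns:
            if row.get(col):  # leave empty cells untouched
                row[col] = json.loads(row[col])
        rows.append(row)
    return rows

# Build a small sample: one field has an embedded comma, one holds a JSON list.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["ballot_id", "contest", "marks"])
writer.writeheader()
writer.writerow({
    "ballot_id": "00001",
    "contest": "Mayor, City of Example",
    "marks": json.dumps(["Option A", "Option B"]),
})
sample = buf.getvalue()

rows = read_csv_with_json(sample, ["marks"])
```

The writer adds the double quotes around the fields that need them, and the reader removes them again, so the embedded comma and the JSON list both survive the round trip.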
Discrepancy Report
This is the most important report from AuditEngine, which is the result of comparing the votes extracted by AuditEngine with those extracted by the voting system in the form of the CVR. This is a lengthy automatically generated report and includes the following:
- Introduction to make sure the reader understands our terminology.
- Metadata Summary: This metadata summary includes comparison counts between the CVR, Images and Cast.
- Summary of Discrepancy Records:
- High-Level Reconciliation by sheets and by contests, including pie charts.
- Audit-Engine Flagged Report, by sheet and by contests, including pie charts.
- Contest Variants Breakdown, by sheet and by contests, including pie charts.
- Normal Disagreed (No write-ins or overvotes) by sheet and by contests, including pie charts.
- Non-additive Groups - including Contest Variants, Disagreed, Ballot Variants, uncategorized (should be 0) and Blank sheets.
- Ballot Variants
- Detailed groups
- Write-ins Detailed, by sheet and by contests, including pie charts. Includes both agreed and disagreed write-ins. Please note that AuditEngine does not review detailed write-in information, as this is normally done extensively by the election office using human-eye review.
- Overvotes Detailed, by sheet and by contests, including pie charts, agreed and disagreed.
- Gray Flagged agreed votes.
- Relevant Settings.
- Contest Discrepancy Table. Each contest is summarized as one line in a table, with the following fields:
- Total: Total ballots cast which included this contest with images that were processed by AuditEngine.
- Non-Variant: Ballots with this contest where the official outcome and the evaluation by AuditEngine agreed, and did not include write-ins, overvotes, and were not flagged as 'gray'.
- Agreed Overvotes: Ballots where both AuditEngine and the voting system detected an overvote.
- Agreed Write-ins: Ballots which included write-ins in terms of a marked target, but where the name written-in may not be a qualified write-in candidate, and the write-in may be correctly attributed as a vote for a listed candidate.
- Agreed Undervotes: Undervotes are very numerous and we do not break those down here, and are not included in the discrepancy report unless they are disagreed or gray flagged.
- Disagreed: These are ballot-contests which were not initially evaluated as overvotes or write-ins, and where the evaluation by AuditEngine disagrees with the voting system.
- Gray Only: Ballot-contests where AuditEngine detected an ambiguous mark on this contest or used heuristics to decide voter-intent. This column omits any ballots which are in the columns for write-ins, overvotes, or disagreed ballots, even if AuditEngine internally flags them as gray.
- All Variants: Ballots with Agreed Overvotes, Agreed Write-ins, Disagreed, or Gray Flagged. The number of ballots cast should equal the sum of "Non-Variant" and "All Variants". The components of this column are highlighted.
- Disagreed% of Margin: This provides a good measure of whether the variants may have any impact on the outcome, and the highest five values are highlighted. Further analysis is still required to see if the disagreements will reduce the current margin of victory.
- Variant% of Margin: This provides a maximum measure of whether the variants may have any impact on the outcome, and the highest five values are highlighted. Typically, the vast majority (perhaps 90%) of All Variants are Agreed Write-ins and Agreed Overvotes, which only rarely result in any changes in the outcome.
- Vote Margin: This is the margin of victory, i.e. gap between votes for runner-up and winner (lowest winner if contest has multiple winners) among the ballots processed, and may be a subset of the total margin for the entire district if AuditEngine did not receive or process all ballot images.
- Contest Details: Each contest is then reviewed in detail. To limit the size of the report, contests are only detailed if they are one of the first 10 contests, the closest 5 contests, or the top 5 contests with the most variants. Also, any contest of interest can be reviewed in detail.
- CVR results for this contest.
- Summary of the comparison results for this contest (same as the line in the Contest Discrepancy Table)
- Disagreed ballots by group, detailed by record types. These are summary tables for each record type. Click on group designation and it will go to the individual records.
- Normal Disagreed
- Write-ins
- Overvotes
- Gray-Flagged
- Individual records for each discrepancy. This is a lengthy section which shows the discrepancy record followed by the image of the ballot, front and back. Click on the thumbnail and the full size image is displayed in another window.
- Precinct Report Summary Table -- Each precinct is summarized as a single line in a table. Columns in this table are similar to the contests table.
- Precinct Details -- Precincts are detailed if they have the highest Disagreed% of Total or highest Variant% of Total. For each precinct, they are first broken down by group, then detailed to the ballot. Ballots are shown as thumbnails and can be viewed in full resolution by clicking.
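The glossary does not spell out the formula behind the two percent-of-margin columns in the Contest Discrepancy Table. The sketch below assumes each is simply the relevant count divided by the Vote Margin, expressed as a percentage, and computes the margin as described above (the gap between the lowest winner and the runner-up). The function names are illustrative only.

```python
def vote_margin(totals, winners=1):
    """Gap between the lowest winner and the best-placed loser.

    totals maps each option to its vote count; winners is the number of seats.
    """
    ranked = sorted(totals.values(), reverse=True)
    if len(ranked) <= winners:
        raise ValueError("need at least one losing option to compute a margin")
    return ranked[winners - 1] - ranked[winners]

def percent_of_margin(count, margin):
    """Assumed definition: count as a percentage of the vote margin."""
    if margin == 0:
        return float("inf")  # a tie: any discrepancy could matter
    return 100.0 * count / margin

totals = {"Option A": 520, "Option B": 480, "Option C": 100}
margin = vote_margin(totals)                   # single-winner contest
disagreed_pct = percent_of_margin(10, margin)  # 10 disagreed ballot-contests
```

Under this assumption, a value approaching or exceeding 100% means the counted records are numerous enough that they could, in principle, erase the margin, which is why further analysis is still required.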
Direct Recording Electronic (DRE) voting machines
After the Help America Vote Act (HAVA) was passed in response to the year 2000 election debacle, districts began to adopt fully Direct Recording Electronic voting machines which recorded the vote to internal memory only. Because these systems had no paper trail, the vote could be altered or lost and there was no way to check it. In response to this problem, these machines were retrofitted with a Voter-Verifiable Paper Audit Trail (VVPAT) device, which recorded the votes onto a paper tape that the voter could review. These voting systems are now being retired in favor of hand-marked or BMD voting systems.
Disagreed Contest
As used in the comparison process, a contest is considered "disagreed" unless the voting system (from the CVR) and AuditEngine fully agree on the outcome, including whether it was overvoted, undervoted, or had write-ins. Since write-ins and overvotes are frequently reviewed and adjudicated by election staff, disagreed write-ins and overvotes are treated separately by the AuditEngine analysis. "Normal Disagreed" are the rest of the disagreed contests, which do not include write-ins or overvotes, but instead provide the actual difference in the evaluation of the vote cast by AuditEngine vs. the voting system.
District Record
For each election processed by AuditEngine, there will be a District Record defined, which will provide the details about the given district. Districts are usually counties, but not always. Once the district is set up, an Election Record can be defined for each election processed in this district. Finally, one or more Audit Jobs can be defined and run. These audit jobs can be combined in a single Project.
The District Record uses the same naming convention as the Election Record, but without the date of the election:
CC_SS_District
Where:
- CC -- is the Country Code (can be omitted if 'US'). Full List
- SS -- is the two-character state code. Full List
- District -- is the District name, typically a County, State, City, or other jurisdiction.
For example: GA_Bartow, CA_SanDiego, GA_DeKalb, FL_MiamiDade
See also Election, Audit Job, Job Name
Dominion Voting Systems (Dominion)
A major voting system vendor with approximately 37% of the market.
Duplicated / Remade / Rescanned Ballot
Multiple ballot images that are identical to the human eye or are equivalent representations but are digitally different, and will likely have different ballot_id values as well. These occur when:
- The original ballot is damaged and will not scan properly
- The original ballot is in a format or language which is not directly supported by the configuration of the voting system.
- Ballots are rescanned, either intentionally or unintentionally, and combined with other images in the set for those same ballots.
For the first two causes, election districts may enter ballots into BMD to create a Ballot Summary Card, or carefully transpose the marks to a new Hand-Marked Paper Ballot.
Election Information File (EIF)
This is a required file that defines many aspects of a specific election in a specific district, including the contests, contest options, write-ins, etc. A draft EIF file is generated from the CVR if it exists. It includes the following fields for each contest in the election.
- official_contest_name - should match the string used in the CVR if it is available. See Contest Name
- vote_for - the maximum number of votes in this contest, default: 1. See Overvote and Undervote
- writein_num - the number of write-in options provided.
- official_options - list of candidates or yes/no options, not necessarily in the order on the ballot. These strings should match those used in the CVR exactly.
- bmd_contest_name - contest names as found on BMD ballots. This may be the same as the official_contest_name but we must have the exact string to ensure accurate Optical Character Recognition (OCR) of BMD ballot selections.
- qualified_writeins - list of qualified write-ins for this contest
This file should be reviewed by the auditing team, particularly to verify that the 'bmd_contest_name' and 'writein_num' values accurately match the ballot layouts, and to adjust them where they do not. Note: Other fields exist but are not normally used, because most of the definition is not required when we use the TargetMapper App to generate the map instead of doing image analysis.
Election Record
Within the AuditEngine app, within a given District, there may be multiple elections defined. The "Election Record" is the information related to that election, and will eventually include the files uploaded for that election and used by one or more Audit Jobs.
In AuditEngine, we have established a standard naming convention for elections, as follows:
CC_SS_District_YYYYMMDD
Where:
- CC -- is the standard two-character country code, typically US (United States of America); if US, it is sometimes left out. Full List
- SS -- is the standard two-character State code, such as AZ (Arizona), CA (California), and FL (Florida) Full List
- District -- is the District name, typically a County, but sometimes a State (Alaska).
- YYYYMMDD -- is the date of the election, such as 20241105.
For example, US_GA_Bartow_20241105
See also District, Audit Job, Job Name
Election Management System (EMS)
This is the general term for the suite of software applications that provide all the functions needed to assist election officials to perform elections. Functions include the definition of ballots, configuring voting machines that are used in polling locations, central tabulation, and reporting.
Election Systems & Software (ES&S)
A major voting system vendor with approximately 47% of the market.
ETag
Short for "Entity Tag", this is a string of hexadecimal digits (usually MD5 format) and possibly a hyphen followed by decimal digits. These are used by cloud storage to allow detection of changes in the files uploaded. AuditEngine can calculate the ETags of local files to know if uploading or downloading is necessary. These are used by the AuditEngine Pipeline to know whether a stage needs to be re-run, because the dependencies have changed. See also MD5, Pipeline, SHA1 / SHA256 / SHA512, Stage
An example ETag as used by AWS S3: 6cf81d4ec591b351adbfb33ed5861b6f-228. The first part is the MD5 checksum over a list of MD5 checksums of smaller chunks of data, and the number after the hyphen is the number of chunks, in this case 228. As a user of these values, the only thing you can determine is whether they differ.
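The structure of a chunked ETag can be sketched in a few lines of Python. This is an illustration under stated assumptions, not AuditEngine's implementation: the function name is hypothetical, the chunk size must match whatever the uploading client used, and data that fits in a single chunk is assumed to have been uploaded in one request and so to carry a plain MD5 value.

```python
import hashlib

def multipart_etag(data: bytes, chunk_size: int) -> str:
    """MD5-of-MD5s value in the 'hexdigest-count' form described above."""
    if len(data) <= chunk_size:
        # Assumption: small objects are uploaded whole and get a plain MD5.
        return hashlib.md5(data).hexdigest()
    # Binary MD5 digest of each chunk, in order.
    digests = [
        hashlib.md5(data[i:i + chunk_size]).digest()
        for i in range(0, len(data), chunk_size)
    ]
    # MD5 over the concatenated chunk digests, then the chunk count.
    return hashlib.md5(b"".join(digests)).hexdigest() + "-" + str(len(digests))

small = multipart_etag(b"abc", chunk_size=8)
large = multipart_etag(b"x" * 20, chunk_size=8)  # chunks of 8, 8 and 4 bytes
```

Because the result depends on the chunk size, two copies of the same file can yield different values if they were uploaded with different settings, so a locally computed value is only comparable when the chunking matches.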
Evaluation Heuristics
AuditEngine first recognizes the marks using the Adaptive Thresholding methodology which employs a number of algorithms to determine the optimal thresholds that are used for each ballot. Once the marks are determined, then there is a stage of evaluating the marks to determine the votes in the contest. If the correct number of marks are detected for the contest, then the marks are directly converted to votes. If there are too many marks, then the contest may be evaluated as an overvote and if there are too few, it may be considered an undervote. A set of evaluation heuristics are used to attempt to resolve an overvote by one vote and a single undervote. If the overvote is only by one vote, then the marks are evaluated to see if there is a darker scratch-out by the voter, or a lighter hesitation mark. If it is a single undervote, then the algorithm looks for possible light votes. If any attempts are made to resolve these by AuditEngine, the contest is Flagged.
When compared with voting systems without adjudication, AuditEngine resolves 75% to 95% of Normal Disagreed Contests when the image has no corruption, and there are no write-ins.
extractvote
This is the primary stage in the audit process where all ballot sheets are individually processed to recognize either the marks made by the voter, if it is a hand-marked paper ballot format, or to perform Optical Character Recognition (OCR) on the text summary on the ballot summary cards produced by BMDs. The extractvote processing is delegated to a fleet of individual compute instances in our datacenter and we are authorized to run up to 10,000 compute instances. See also Fleet, Lambda.
Flagged
A contest is Flagged by AuditEngine if it includes write-ins, if it is ambiguous (that is, if AuditEngine used heuristics to guess the correct resolution of hesitation marks and cross-outs in the case of overvotes), or if the ballot image was unprocessable due to damage to the image. AuditEngine can be used to produce its own canvass of the election. When the margin of victory is close, the Flagged contests can be reviewed using the AuditEngine AdjudiTally App to resolve them by human-eye evaluation of the images.
Fleet
The term "fleet" refers to the set of compute instances (AKA computers) that AuditEngine uses in stages that can be expedited by parallel processing. AuditEngine currently uses AWS Lambda compute instances. We are authorized to use up to 10,000 at a time, but a given stage may use only 2,500 instances in parallel and process the remaining work in successive rounds.
Because we are limited in the overall number of instances that can be used at any time, we use a reservation system during critical post-election times when many election districts are conducting audits simultaneously. Each delegation to the fleet typically takes less than 15 minutes, unless we are processing a very large number of sheets. Once one user has released their reservation, another delegation to the fleet can occur.
Full Hand Count Audit (FHCA)
This auditing method simply hand-counts all ballots and creates a total of all ballots cast. It differs from a traditional recount and is considered an "audit" if there are additional controls to limit errors and incremental comparisons on batch or precinct basis, to locate any machine errors. FHCAs are easier to conduct than RLA audits because there are no statistics required and auditors can make corrections to the results as they go. In contrast, audits that rely on samples (and not all ballots), must not correct the samples as they go, and this is nearly impossible for workers to resist. However, RLA audits, if conducted well, may be able to process a very small sample of ballots if the margins are over 2% to 5%.
Fully Agreed Sheets
In the comparison process, a ballot sheet is classified as Fully Agreed if it has no write-ins, overvotes, or gray-flags, and every contest fully agrees between the evaluation of the voting system and AuditEngine. Such sheets are maintained as a single record. See Partially Agreed Sheets.
gentemplates
This is an important stage in the AuditEngine processing pipeline where individual ballot images of each sheet and each style are combined to create a standard style template of each style. Each sheet has its own style. See also Ballot Style, and card_code.
Hand-Marked Paper Ballot
A ballot format that provides all the contests and options available for voting by a specific voter on a set of ballot sheets, with targets that can be marked by the voter using a conventional pen. This type of voting is not directly usable by voters who are blind or cannot operate a pen. Federal law currently mandates that each polling place have machines that are compatible with the needs of voters with disabilities. Some BMD devices produce a ballot format identical to the hand-marked paper ballot even though the ballots are machine marked. These can then be completed by some voters with disabilities using the BMD interface or assistive devices. Also known as nonBMD ballots.
Hart Intercivic (Hart)
A voting system vendor that has a relatively small footprint nationally and whose systems produce ballot images. The Hart BMD format is the same as the hand-marked paper ballot format, so it is not easy to tell that a BMD device was used. These ballots are completely verifiable and can be processed without using Optical Character Recognition (OCR).
Hash Value / Hash Manifest File
A hash value is a fixed-length "fingerprint" of a block of data which
- is relatively easy to calculate,
- will change substantially if even one bit is changed, and
- is infeasible to predict.
These are typically calculated over a given file. It is infeasible to alter a given file to produce another valid file that also produces the same hash. The receiver of any file can calculate its hash value and compare it with the hash value in the Hash Manifest File to verify that the file is unchanged.
A single "Hash Manifest File" can be prepared that includes the file name and a hash value for each file provided in the export of the official results. There are free applications that will prepare a _Hash Manifest File_ of any given folder.
We recommend the free Windows Application QuickHash to prepare a Hash Manifest File for all the files produced as the result of an election.
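For readers who want to see what such a manifest involves, the following Python sketch hashes every file in a folder with SHA-512 and later reports which files no longer match. It is only an illustration, not a replacement for QuickHash; the function names, the in-memory dictionary format, and the sample file name cvr.csv are assumptions made for this example.

```python
import hashlib
import tempfile
from pathlib import Path

def file_hash(path, algorithm="sha512", block_size=1 << 20):
    """Hash a file in blocks so large archives need not fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()

def build_manifest(folder):
    """Map each file's path (relative to folder) to its hash value."""
    folder = Path(folder)
    return {
        path.relative_to(folder).as_posix(): file_hash(path)
        for path in sorted(folder.rglob("*"))
        if path.is_file()
    }

def changed_files(folder, manifest):
    """Names listed in the manifest that are now missing or different."""
    current = build_manifest(folder)
    return sorted(n for n, value in manifest.items() if current.get(n) != value)

with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "cvr.csv").write_text("ballot_id,contest\n")
    manifest = build_manifest(tmp)
    unchanged = changed_files(tmp, manifest)   # nothing altered yet
    (Path(tmp) / "cvr.csv").write_text("ballot_id,contest\nX\n")
    after_edit = changed_files(tmp, manifest)  # the file was altered
```

The receiver runs the same calculation on the files as received and compares the values against the manifest supplied by the county.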
Hexstyle
Each vendor and each county can use an arbitrary method for identifying styles, typically by number. The number is just a label, and does not provide information about what contests exist on the ballot. Therefore, we have defined a method of describing the style, essentially by listing the contests that exist on that ballot. We call this the hexstyle, even though the form of the indicator may not use hex digits.
TYPE 1: hexadecimal encoded bitmap:
This is a style indicator, originally defined by AuditEngine, to represent the contests included on a ballot. It is represented as a hexadecimal number written with the characters 0-9 and a-f, where each character represents a 4-bit sequence, and the bits follow the order in which the contests are defined in the EIF. In the bitfield, each bit in the sequence is 1 if the contest exists on the ballot of that style, and 0 if it does not. The sequence is left justified, and padded with 0's on the right. So the hexstyle value 0xc003 indicates that the first two contests and the last two contests are on the ballot, and the middle 12 contests are not. The entire binary sequence in this example is 1100 0000 0000 0011. The characters "0x" indicate that it is a hexadecimal number, and "c" indicates the bit sequence 1100, while 3 indicates 0011.
The hexstyle is a shorthand way to represent the contests on the ballot, which is a primary determinant of style. It does not handle differences in style due to language, or option rotation. There should be only one hexstyle per style_num, but more than one style_num may have the same hexstyle.
The following table provides the hexadecimal characters for each bit sequence. This is the standard definition.
0 - 0000 | 1 - 0001 | 2 - 0010 | 3 - 0011 |
4 - 0100 | 5 - 0101 | 6 - 0110 | 7 - 0111 |
8 - 1000 | 9 - 1001 | a - 1010 | b - 1011 |
c - 1100 | d - 1101 | e - 1110 | f - 1111 |
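The Type 1 encoding can be illustrated with a short Python sketch. The function names are hypothetical, and contest positions are counted from 0 in EIF order.

```python
def encode_hexstyle(contest_flags):
    """Turn a list of booleans (one per contest, in EIF order) into a hexstyle."""
    bits = "".join("1" if flag else "0" for flag in contest_flags)
    bits += "0" * (-len(bits) % 4)  # pad on the right to a multiple of 4 bits
    digits = [format(int(bits[i:i + 4], 2), "x") for i in range(0, len(bits), 4)]
    return "0x" + "".join(digits)

def decode_hexstyle(hexstyle):
    """Return the 0-based positions of the contests present on the ballot."""
    digits = hexstyle[2:] if hexstyle.startswith("0x") else hexstyle
    bits = "".join(format(int(digit, 16), "04b") for digit in digits)
    return [index for index, bit in enumerate(bits) if bit == "1"]

flags = [True, True] + [False] * 12 + [True, True]  # 16 contests defined
hexstyle = encode_hexstyle(flags)
present = decode_hexstyle("0xc003")
```

Because of the right-hand padding, a decoded value can only say which positions are set; the number of contests defined in the EIF has to come from the EIF itself.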
TYPE 2: Contest Numbers with run-length
The form of the hexstyle has changed due to the extremely long hex sequences that may result when there are a large number of contests. If there are more than 160 contests (which would be more than 40 hexadecimal digits), or optionally always, a run-length encoded hexstyle is used, of the following form:
<(contest_idx)[xrun_length]|(contest_idx)[xrun_length]|...>
like: <14x4|20|23|25x3|745|810x2>
specifies the following contests:
14, 15, 16, 17, 20, 23, 25, 26, 27, 745, 810, 811
Example: consider the hexstyle in bit-mapped form: 0xc003 has contests in locations 0, 1, 14, 15
runlength form:
<0x2|14x2>
or <0|1|14|15>
Thus, for very few contests, the Type 1 hex mapping is the most economical, but becomes unwieldy if there are many contests defined when only a few are included on any ballot sheet, and then Type 2 is essential. Using Type 2 is a little bit less economical for very few contests but is still quite reasonable to use at all times. Therefore, Type 1 may be deprecated in the future.
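A minimal Python sketch of the Type 2 form follows, assuming exactly the syntax shown in this entry (items separated by "|", with an optional "x" and run length after the starting contest index). The function names are illustrative.

```python
def decode_runlength(hexstyle):
    """Expand a run-length hexstyle such as <14x4|20> into contest indexes."""
    text = hexstyle.strip()
    if not (text.startswith("<") and text.endswith(">")):
        raise ValueError(f"not a run-length hexstyle: {hexstyle!r}")
    inner = text[1:-1]
    contests = []
    for item in inner.split("|") if inner else []:
        start, _, run = item.partition("x")
        length = int(run) if run else 1  # a bare index is a run of one
        contests.extend(range(int(start), int(start) + length))
    return contests

def encode_runlength(contests):
    """Collapse consecutive contest indexes into start-x-length items."""
    ordered = sorted(set(contests))
    items = []
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and ordered[j + 1] == ordered[j] + 1:
            j += 1
        run = j - i + 1
        items.append(f"{ordered[i]}x{run}" if run > 1 else str(ordered[i]))
        i = j + 1
    return "<" + "|".join(items) + ">"

expanded = decode_runlength("<14x4|20|23|25x3|745|810x2>")
compact = encode_runlength([0, 1, 14, 15])
```

Both spellings of the same style, <0x2|14x2> and <0|1|14|15>, decode to the same list, so comparisons are best made on the decoded contest lists and not on the strings.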
Images Missing
This is a discrepancy attribute. Sometimes not all the images are provided and so these are accounted for as the number of Images Missing. AuditEngine does not attempt to sum the number of contests nor the votes on ballot images that are missing. If there are a vast number of images missing, this can make an accurate audit difficult.
Job Name
This is the name of the AuditEngine job. It differentiates the job from other jobs and determines where generated files are stored. The format of the Job Name has been standardized within AuditEngine around the Election name, such as ST_County_YYYYMMDD, where ST is the two-character state abbreviation, County is the County name without spaces or special characters, and YYYYMMDD is the date of the election when the polls closed. It may also have the two-character country code as a prefix, such as "US_".
Thus, GA_Bartow_20201103 is the job for the 2020 General Election in Bartow County, GA. It is okay to add additional tags to the end, like _LAT if it is the job to process the Logic and Accuracy Test ballot images in that same election, or for other purposes. NOTE: The job_name cannot easily be changed once it is set. It cannot have spaces or special characters other than underscore. Jobs may sometimes be created from other jobs using the 'clone_job' function. The name of the cloned job must start with the same name given to the Election, followed by a suffix modifier, like _clone, _functest, etc. For example, GA_Bartow_20201103_clone.
See also: District, Election, Audit Job
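The naming rules above can be checked mechanically. The Python sketch below is one possible reading of them, not AuditEngine's own validator: the regular expression, the function name, and the choice to report an omitted country code as "US" are assumptions.

```python
import re
from datetime import datetime

# [CC_]SS_District_YYYYMMDD[_tags]; only letters, digits and underscores.
JOB_NAME = re.compile(
    r"^(?:(?P<country>[A-Z]{2})_)?"
    r"(?P<state>[A-Z]{2})_"
    r"(?P<district>[A-Za-z0-9]+)_"
    r"(?P<date>\d{8})"
    r"(?:_(?P<tags>[A-Za-z0-9_]+))?$"
)

def parse_job_name(name):
    """Split a job name into its parts, or raise ValueError if it is malformed."""
    match = JOB_NAME.match(name)
    if match is None:
        raise ValueError(f"not a valid job name: {name!r}")
    parts = match.groupdict()
    parts["country"] = parts["country"] or "US"  # assumed default when omitted
    # strptime also rejects impossible dates such as 20201340.
    parts["date"] = datetime.strptime(parts["date"], "%Y%m%d").date()
    parts["tags"] = parts["tags"].split("_") if parts["tags"] else []
    return parts

lat_job = parse_job_name("GA_Bartow_20201103_LAT")
full_job = parse_job_name("US_GA_Bartow_20241105")
```

A name containing a space or other special character fails the match and raises ValueError, which is the behavior one would want before a job is created, since the name is hard to change afterwards.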
Job Settings
For each Audit Job, there is a set of related Job Settings. The job settings for a given job are stored in a file named "JOB_" followed by the job_name. It is a CSV (comma-separated values, i.e. spreadsheet) file and can be edited with any spreadsheet program, but is normally edited through the AuditEngine App. There are a vast number of possible settings that can control AuditEngine when it processes an election. Most of these are to allow AuditEngine to handle ballots from various vendors, and variations we find due to differences in how the Voting Systems are programmed. We can summarize these settings into a number of categories:
- Source Files -- The locations and names of Ballot Image Archives, Verification Archives, CVR files, BSM Files, and other exports from the EMS. These parameters are provided by the AuditEngine App derived from the files that are uploaded.
- Election Info -- Information about the election district used for reporting, such as the official ballots cast, population, registered voters, registration partisan bias, etc.
- Layout Info -- Regions or adjustments to regions where information can be found on the ballot layout, particularly for Hand-Marked Paper Ballot layouts, such as to read the Ballot Style, pstyle, pcounty, Write-in Area adjustments, etc.
- File details -- Information to allow metadata to be extracted from the Archives and CVR files.
- Execution and Reporting Controls -- Controls used to select specific precincts, styles, groups, contests, etc. to limit execution and reporting.
JSON
JSON is an acronym that stands for "JavaScript Object Notation". Although it was originally defined for use in the programming language JavaScript, it is now the most widely used export format that can express relatively complex and nested data structures, and has supplanted XML in popularity. AuditEngine can import the JSON CVR Export and it also produces files in this format as the result of various stages.
JSON CVR Export
We use this term to refer currently only to the Dominion CVR Export that provides records in JSON format, roughly following the NIST Common Data Format CVR standard.
Lambda
AuditEngine is currently deployed to Amazon Web Services (AWS) cloud, and therefore our parallel processing Fleet uses their Lambda service. One Lambda is one compute instance in the Fleet. See Fleet.
List of Registered Voters
The published list of voters who are registered as eligible to vote in the election, including persons who are on the "list of inactive voters" is the List of Registered Voters. This list as published should not include any personal identifying information and may be limited to no more than the voter's name, year of birth, and street address in each record, even though the official record for each voter may include additional information. This list does not include protected voters, but the number of protected voters that are registered should be provided without listing them. This list should be published to include all voters that were registered as of election day, and include those who registered on election day, and not be further updated (with later registrations) so as to allow for accurate comparisons.
List of Voters Who Voted
This list provides the list of voters that either:
- returned an absentee or mail ballot which was accepted for tabulation, or
- checked in at a polling place, even if they left without casting their ballot.
This list as published should not include any personal identifying information and may be limited to no more than the voter's name, year of birth, street address in each record, even though the official record for each voter may include additional information. This list does not include protected voters, but the number of protected voters who voted should be provided without listing them. For this list to be useful, it must be fully updated, including all voters who voted. It is unfortunately a common practice to not fully update this list as it is commonly used by campaigns in get-out-the-vote (GOTV) efforts and some jurisdictions may typically only update it to include those voters who voted very early but not include the last day or two.
Logic and Accuracy Test (LAT)
Voting systems typically are required to complete a "Logic and Accuracy Test" (LAT) by state law, where the voting system is configured and it processes test ballots to check that the mapping of Targets on Hand-Marked Paper Ballots and BMD ballot summary cards are correctly linked to the contest and options as reported in cast vote records (CVR). Essentially, the LAT answers the question:
Does the voting system (hardware and software) read and tabulate the marks on a ballot or touches on the screen with 100% accuracy?
The test ballots are usually marked uniformly, not with light marks, circles, checkmarks next to the oval, etc., because the goal is to fully test the software, NOT to simulate an election. The LAT should also include BMD ballots.
AuditEngine can audit the LAT ballot images by processing them as if they were from an election, and then comparing the ballots with the known good CVR results. This can test the configuration of AuditEngine and, as a side benefit, evaluate the coverage of the LAT test ballots.
AuditEngine can use the LAT ballot images and LAT CVR to create the Target Map so the actual election data can be processed with faster turnaround. In this mode, it is best if the LAT CVR includes the "Ballot Style" field (ES&S) and for Dominion, the JSON CVR export should be used.
Note: For a general treatment of the subject of the LAT test ballots, see Guidelines for Creating a Deck of Test Ballots (John Washburn)
Mapping
One key Phase in the process of performing a ballot image audit is the mapping of the contests and options to specific target locations on a paper ballot of a specified style, resulting in the Target Map. This Phase includes a number of stages for setting up, processing, and then importing the results from the TargetMapper App.
MD5
A popular hash algorithm used to detect changes in files due to uploading or downloading errors. It is not strong enough to be used for critical cryptographic purposes, but it continues to be used by cloud storage services in ETags and is generally regarded as "good enough" for those purposes due to the restrictions on file structure. Other stronger algorithms are recommended for critical cryptographic purposes, such as SHA1 / SHA256 / SHA512. We suggest using SHA512 for the Hash Manifest because it is available and is extremely strong, but cloud storage services continue to use MD5.
An example of an MD5 checksum: f664b587aacae05c8aa5c591b8659ec4
Metadata
Essentially "data about other data", and used to refer to attributes of ballot images and cast vote records that are not votes, such as ballot_id, precinct, batch, group (election day, early, mail, etc), sheet, style, file size, Archive / Zip Archive location, etc. Many consistency checks are available by reconciling the metadata from the various sources. See also "BIF".
Modified Record
See 'Adjudicated'.
Nonvariant Ballot
A nonvariant ballot is not a Ballot Variant and has no Contest Variants. In essence, the voting system's evaluation of a nonvariant ballot is 100% in agreement with the evaluation by AuditEngine, and the ballot has no write-ins, overvotes, or gray-flags. However, it might have agreed undervotes.
Optical Character Recognition (OCR)
This is the process used by computer software to convert printed text that is scanned as an image into character codes for each character. AuditEngine uses this process to convert the text printed on BMD ballot summary cards into text to avoid relying on barcodes.
Overvote
If a voter marks more than the number of options allowed in the contest, it is considered "overvoted" and no vote is awarded to any option. For example, if the "vote-for" number is 1 and the voter votes for three options, it is considered one overvote.
Overvotes do not occur on BMD ballots.
Very frequently, overvotes are misinterpreted by the voting system and should be fully reviewed in close contests. If a contest has an overvote, it will be first classified as an overvote, even when the write-in target is selected. Some election departments will mark overvotes as an undervote when adjudicated if they are confirmed as an overvote. AuditEngine understands this form of adjudication and does not regard it as a disagreement.
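The basic rule can be written as a small Python function. This is a sketch of the definition only; it does not attempt the Evaluation Heuristics or the handling of adjudicated records, and the function name and status strings are illustrative.

```python
def classify_contest(marked_options, vote_for=1):
    """Return (status, awarded_votes) for one contest on one ballot.

    marked_options is the list of options the voter marked;
    vote_for is the maximum number of votes allowed in the contest.
    """
    count = len(marked_options)
    if count > vote_for:
        return "overvote", []  # no vote is awarded to any option
    if count < vote_for:
        return "undervote", list(marked_options)  # the marks made still count
    return "voted", list(marked_options)

three_marks = classify_contest(["Option A", "Option B", "Option C"], vote_for=1)
two_of_three = classify_contest(["Option A", "Option B"], vote_for=3)
```

The first call mirrors the example above: three marks in a vote-for-1 contest count as a single overvote, and none of the three options receives a vote.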
Page
One side of a sheet, if sheets are printed on both sides. If a ballot provided to a single voter has 2 sheets, then the pages are numbered 1, 2, 3, 4, for the front of sheet 1, back of sheet 1, front of sheet 2, back of sheet 2. Sometimes pages and sheets are numbered starting at 0, so we have to be careful. If we know the page or sheet will start at 0, we will call it page0 or sheet0.
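The two numbering conventions can be made explicit with a short Python sketch, assuming two-sided sheets. The function names follow the page0 / sheet0 naming above but are otherwise illustrative.

```python
def page_number(sheet, side):
    """1-based page for a 1-based sheet; side is "front" or "back"."""
    offset = {"front": 1, "back": 2}[side]
    return 2 * (sheet - 1) + offset

def page0_number(sheet0, side):
    """0-based page (page0) for a 0-based sheet (sheet0)."""
    offset = {"front": 0, "back": 1}[side]
    return 2 * sheet0 + offset

pages = [page_number(s, side) for s in (1, 2) for side in ("front", "back")]
pages0 = [page0_number(s, side) for s in (0, 1) for side in ("front", "back")]
```

Keeping the two conventions in separate, clearly named functions avoids off-by-one mistakes when metadata from different sources is reconciled.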
Partially Agreed Sheets
In the cmpcvr comparison process, if a ballot sheet has at least one contest-variant, then that contest-variant is logically pulled from the partially agreed sheet comparison record, and what remains in the partially agreed sheet record are the rest of the non-variant contests. Note that if all contests are considered contest-variants, then the partially agreed sheet will persist as an empty shell with no contests left in it, and all contest variants will be moved to the contest-variants set.
Personal Identifying Information (PII)
Information normally regarded as personal and private. For elections, it typically includes a person's month and day of birth, driver license number, non-operating license number, social security number or portion of that number, Indian census number, father's name, mother's maiden name, and state and country of birth, email address, or the record(s) of that person's signatures.
Please note that extraneous marks on a ballot are generally not regarded as PII, even if the voter includes their initials or signature on the ballot, because the ballot is anonymous and cannot be linked to any specific voter.
Phase
See Audit Phase
Pipeline
The operation of AuditEngine is divided into a series of stages, where each stage has defined inputs (dependencies) and a number of output files. The set of the stages together is called the pipeline. Each stage cannot be executed unless its dependencies are available from prior stages.
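The dependency rule described above can be illustrated with a short sketch. It assumes that each stage is described only by the names of its input and output files; the `Stage` class, the stage names, and the file names are hypothetical and are not AuditEngine's actual configuration.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    inputs: set    # files this stage depends on
    outputs: set   # files this stage produces


def run_pipeline(stages, available_files):
    """Run stages in order and stop at the first one whose inputs are missing.

    Returns the names of the stages that could be executed.
    """
    available = set(available_files)
    completed = []
    for stage in stages:
        if stage.inputs - available:     # some dependency is not available yet
            break
        available |= stage.outputs       # outputs feed later stages
        completed.append(stage.name)
    return completed


stages = [
    Stage("extract_metadata", {"ballot_images.zip"}, {"metadata.csv"}),
    Stage("extract_votes", {"metadata.csv", "target_map.json"}, {"votes.csv"}),
    Stage("compare", {"votes.csv", "cvr.json"}, {"report.html"}),
]

# The Target Map has not been created yet, so only the first stage can run.
print(run_pipeline(stages, {"ballot_images.zip", "cvr.json"}))
# ['extract_metadata']
```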
Poll Tapes / Digital Poll Tapes / Poll Tapes Audit
Voting machines that are used in polling places typically have a poll-tape report which is printed out by poll workers during an election, and frequently is posted at the polling site with a signed copy turned in to the election office. Digital Poll Tapes can be produced by ES&S voting systems. These are an exact copy of the report produced by the voting system scanner but they are provided as PDF, LST or TXT files. These files can be processed by AuditEngine to provide another check of the results of the election by comparing the aggregated results for each precinct with the final aggregated report. This audit is not available for Dominion systems.
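The comparison at the heart of a Poll Tapes Audit can be sketched as below. This is only an illustration under the assumption that each poll tape has already been parsed into per-option vote counts; the data layout, function names, and sample numbers are hypothetical, and parsing of the PDF, LST or TXT files is not shown.

```python
from collections import Counter


def aggregate_poll_tapes(tapes):
    """Sum the votes from all precinct poll tapes.

    tapes maps a precinct name to {(contest, option): votes}.
    """
    total = Counter()
    for precinct_counts in tapes.values():
        total.update(precinct_counts)    # adds the counts key by key
    return total


def find_discrepancies(tapes, final_report):
    """Return {(contest, option): (poll_tape_total, final_report_total)}
    for every entry where the two totals differ."""
    total = aggregate_poll_tapes(tapes)
    diffs = {}
    for key in set(total) | set(final_report):
        tape_votes = total.get(key, 0)
        final_votes = final_report.get(key, 0)
        if tape_votes != final_votes:
            diffs[key] = (tape_votes, final_votes)
    return diffs


tapes = {
    "Precinct 1": {("Mayor", "Alice"): 120, ("Mayor", "Bob"): 95},
    "Precinct 2": {("Mayor", "Alice"): 80, ("Mayor", "Bob"): 110},
}
final_report = {("Mayor", "Alice"): 200, ("Mayor", "Bob"): 207}

print(find_discrepancies(tapes, final_report))
# {('Mayor', 'Bob'): (205, 207)}
```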
Protected Voters
These are voters that are not included in the published List of Registered Voters or in the published List of Voters who Voted because they are protected by statute or enrolled in an address confidentiality program, typically because the person reasonably believes that their life or safety, or that of another person, is in danger and that restricting access will serve to reduce that danger. Although these voters are removed from the other lists, it is helpful to know both the number and precinct number of protected voters who are registered and the number of protected voters who voted, without having the exact list, to facilitate comparing the List of Voters who Voted with the Number of Ballots Cast.
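The reconciliation described above is simple arithmetic, sketched here for a single precinct. The function name and the sample numbers are hypothetical.

```python
def unexplained_ballots(listed_voters_who_voted: int,
                        protected_voters_who_voted: int,
                        ballots_cast: int) -> int:
    """Return ballots cast minus the number of voters known to have voted.

    A result of 0 means the List of Voters who Voted, plus the count of
    protected voters who voted, fully accounts for the ballots cast.
    """
    expected = listed_voters_who_voted + protected_voters_who_voted
    return ballots_cast - expected


# 1482 voters on the published list plus 3 protected voters explain 1485 ballots.
print(unexplained_ballots(1482, 3, 1485))
# 0
```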
Public Oversight Workflow
AuditEngine has been designed to accommodate a "Public Oversight" workflow, where members of the public, candidates, or campaigns can request ballot image and CVR data and conduct audits using AuditEngine independently from the election office. This will likely happen in a slower time frame, because these data commonly become available, and the audits are commonly initiated, only after the election is completed and when there is some question about the outcome. This is in contrast to the "Cooperative Workflow", where the election districts provide data prior to the election to expedite audit turnaround. See Workflow.
QR Code
A 2-dimensional machine-readable graphical artifact consisting of black squares arranged in a square grid on a white background, including some fiducial markers, which can be read by an imaging device such as a camera. They can, in general, be of various sizes and most smartphones have reading ability built into the camera.
QR Codes are used on BMD ballot summary cards produced by Dominion and by the VSAP (Voting System for All People) developed by Los Angeles County, CA. The QR Codes used in the VSAP system can be easily read by a smartphone and the codes compared with annotations on the ballot. The QR Codes used by Dominion are, however, binary in nature and are not readable by a smartphone camera, and are thus not verifiable in the same way that the VSAP codes are.
AuditEngine does not use QR Codes to determine the votes on a BMD ballot summary card, but rather reads the printed strings using Optical Character Recognition (OCR). Sometimes, the term barcode is used as a generic term to mean any type of linear or 2-d code, such as a QR Code.
Redline Proofs
A critical step in the operation of AuditEngine is the creation of the Target Map using our TargetMapper App. To check the consistency of this map, we have a consistency check when the Target Map is imported, and AuditEngine creates a full set of "Redline Proofs" which are ballot styles with red outlines and printing on them that shows the location of each Target and the associated CVR Contest Name. These proofs can be reviewed by human-eye to find inconsistencies. Also shown are Write-in Areas.
Repeated Ballot Images
AuditEngine can handle repeated ballot images in Ballot Image Archive / Zip Files. Repeated ballot images can occur if the same ballots are included in two different Ballot Image Archive / Zip Files or if they are included in the same Archive but using different path names. The extra copies are detected and moved to a list of "skipped" repeated ballots, while the copies that are retained are marked as just "repeated". Repeats are detected only if the images have the same ballot_id, but they may have a different full path name within the archive.
We must distinguish these from Duplicated / Remade Ballots, which may result in multiple ballot images that are identical to the human eye but are digitally different, and will likely have different ballot_id values as well. Commonly, election districts may enter ballots into a BMD to create a Ballot Summary Card, or carefully transpose the marks to a new Hand-Marked Paper Ballot.
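Detection of repeated ballot images by ballot_id can be sketched as follows. This is an illustration only: it assumes that the first image seen for a ballot_id is the one retained and that later images with the same ballot_id are the ones skipped, which may differ from AuditEngine's actual rule. The function name and the sample paths are hypothetical.

```python
def split_repeated(images):
    """Separate retained images from skipped repeats.

    images is an iterable of (ballot_id, archive_path) pairs.
    Returns (kept, repeated_ids, skipped) where
      kept         maps ballot_id -> path of the retained image,
      repeated_ids is the set of ballot_ids seen more than once,
      skipped      lists the (ballot_id, path) pairs that were set aside.
    """
    kept = {}
    repeated_ids = set()
    skipped = []
    for ballot_id, path in images:
        if ballot_id in kept:
            repeated_ids.add(ballot_id)        # the retained copy is "repeated"
            skipped.append((ballot_id, path))  # the extra copy is "skipped"
        else:
            kept[ballot_id] = path
    return kept, repeated_ids, skipped


images = [
    ("00012", "archive1/batch01/00012.tif"),
    ("00013", "archive1/batch01/00013.tif"),
    ("00012", "archive2/copy/00012.tif"),   # same ballot_id, different path
]
kept, repeated_ids, skipped = split_repeated(images)
print(sorted(kept), sorted(repeated_ids), skipped)
# ['00012', '00013'] ['00012'] [('00012', 'archive2/copy/00012.tif')]
```

Duplicated / Remade Ballots would not be caught by a check like this, because they normally carry different ballot_id values.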
Risk-Limiting Audit (RLA)
A method of checking that the outcome of an election is accurate by randomly selecting ballots or batches of ballots, and continuing to select samples until the risk that the reported outcome differs from the outcome a full hand count would show is less than a given risk limit. There are four types, based on how the samples are taken (by ballot or by batch) and how those samples are used (compared against the voting system's record of the same ballots or batches, or simply polled), resulting in ballot-comparison, ballot-polling, batch-comparison, and batch-polling. RLAs are difficult to apply when there are many small contests and when margins get tight, and will miss issues that involve just a few ballots. Most RLAs attempt to audit only one or a few contests. In contrast, BIAs (Ballot Image Audits) such as those conducted using AuditEngine can audit all contests and can detect issues down to the specific ballot, are not limited by hand-counting error rates, can be conducted more independently, and are not exposed to the risk of election workers adjusting the samples to produce a clean audit.
Proponents of RLAs point out that BIAs are not looking directly at paper ballots and there is a small chance that the ballot images might be hacked or incorrectly generated. AuditEngine offers the Verification Phase where batches of paper ballots can be scanned using a scanner that is independent from the voting system to detect image manipulation. BIAs can be easily used in conjunction with RLA or full-hand count audits.
S3
This stands for Amazon "Simple Storage Service", but it is normally called just AWS S3 or S3. This is the secure storage service used by AuditEngine. This service does not allow alteration of files after they are written, and the timestamps cannot be altered. Also, our fleets of compute instances must have the data in S3 in the same datacenter, and they also write their results to S3. For security, AuditEngine does not use a database service in the processing of election data, although database servers are used for controlling the AuditEngine App and for letting helper Apps maintain state.
SHA1 / SHA256 / SHA512
Hash algorithms that are stronger than MD5 and result in longer hash values. SHA256 and SHA512 are approved for cryptographic purposes, while SHA1 is now deprecated for that use. If available, we recommend the use of SHA512 in Hash Manifest files. See also MD5 and ETag.
There is much confusion about "SHA" files that may be included with other files provided by the voting system. These files are useful only to detect error in the associated file, and they will not guard against alteration of that file, because it is easy to generate the associated SHA file. However, once the SHA files are published, then the associated files cannot be changed without the possibility of detection. It would be useful for voting systems to create and post these files early-on, even before the images are available, so the files cannot later be modified.
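To illustrate why publishing hash values early makes later changes detectable, here is a minimal sketch that builds a SHA512 manifest for a directory and later checks files against it, using Python's standard `hashlib` module. The manifest layout (a dictionary of relative path to hex digest) and the function names are assumptions for illustration, not the Hash Manifest file format used by AuditEngine.

```python
import hashlib
from pathlib import Path


def sha512_of_file(path, chunk_size=1 << 20):
    """Return the SHA512 hex digest of a file, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory):
    """Map each file's path (relative to directory) to its SHA512 digest."""
    root = Path(directory)
    return {
        p.relative_to(root).as_posix(): sha512_of_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(directory, published_manifest):
    """List files that were added, removed, or altered since publication."""
    current = build_manifest(directory)
    names = set(current) | set(published_manifest)
    return sorted(n for n in names if current.get(n) != published_manifest.get(n))
```

A manifest like this protects the files only once it has been published somewhere the file owner cannot quietly replace it; before that point, anyone who alters a file can also regenerate its hash.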
Sheet
Each ballot image is of a single paper sheet, including both sides. This is the case even if multiple sheets are included in the logical ballot.
In the comparison process, the number of sheets shown in tables is not necessarily a direct sum because a single sheet may include multiple contest variants. Also, depending on the number of sheets included in a single logical ballot, the number of ballots cast may be less than the total number of sheets. This discrepancy may be hard to predict because subsequent sheets are commonly not included in the logical ballot cast. Counting the first sheet will usually produce a pretty accurate number, but even then, some voters do not include the first sheet.
To complicate matters further, BMD ballots may combine multiple sheets onto one ballot summary card, and thus may result in only one ballot image file even when there are multiple sheets involved in Hand-Marked Paper Ballots.
Stage
One set of operations that are logically separated in AuditEngine, with specified input files (dependencies) and output files that result. Each stage commonly also produces reports of the results of those operations. The stages are organized into a pipeline and are executed sequentially until the pipeline is completed or a stage does not have all the inputs available. Sets of stages are also combined logically into phases, for the purposes of explaining their functionality.
Target
Targets are ovals or graphic elements that are marked by the voter using a pen or other writing utensil to indicate a vote.
Target Map
A configuration file produced by the TargetMapper App that associates the location of targets on hand-marked paper ballots with the contest and option as defined by the voting system and as they exist in the CVR. A given Target Map is unique to a given Election in a specific District, but it can be shared by different Audit Jobs by adopting the Target Map of an Audit Job in the same Election.
TargetMapper App
A browser-based application that assists with the mapping of the contests and options as defined by the CVR and the targets on hand-marked paper ballots. The result of the TargetMapper App is the Target Map. See details here: TargetMapper App
Undervote
An Undervote occurs when a voter does not vote for as many options as are allowed in a contest. The number of undervotes is the number of allowed votes that were not used. So if a voter can vote for up to 3 in a contest, and only one option is selected, then the number of undervotes in that contest is 2. AuditEngine will correctly interpret many marks that would be considered undervotes by voting systems, such as when ovals are circled, when a checkmark does not actually get inside the oval, or when marks are very light.
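The counting rules in the Overvote and Undervote entries can be sketched as below. This is a simplified illustration that ignores write-ins and adjudication; the function name is hypothetical, and counting an overvoted contest as a single overvote follows the vote-for-1 example in the Overvote entry and is an assumption for contests that allow more than one vote.

```python
def evaluate_contest(marked_options, vote_for):
    """Classify the marks in one contest on one ballot.

    marked_options is the list of options the voter marked,
    vote_for is the maximum number of options allowed.
    """
    n_marked = len(marked_options)
    if n_marked > vote_for:
        # Overvoted: no vote is awarded to any option.
        return {"votes": [], "overvotes": 1, "undervotes": 0}
    # Otherwise every mark counts, and unused votes are undervotes.
    return {
        "votes": list(marked_options),
        "overvotes": 0,
        "undervotes": vote_for - n_marked,
    }


print(evaluate_contest(["A", "B", "C"], vote_for=1))
# {'votes': [], 'overvotes': 1, 'undervotes': 0}
print(evaluate_contest(["A"], vote_for=3))
# {'votes': ['A'], 'overvotes': 0, 'undervotes': 2}
```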
Unprocessed Sheets
Sheets that were NOT successfully processed by AuditEngine, generally due to images that are damaged due to scanner jitter when the sheet is not evenly fed through the scanner, barcodes that were corrupted and unreliable, or BMD ballots that were not perfectly read by AuditEngine using OCR (optical character recognition). Cleaning the scanner rollers can help to avoid corrupted images. AuditEngine does not track the number of contests on unprocessed sheets. If a large number of ballots are unprocessed from a specific voting machine scanner, then additional maintenance or retiring that scanner may be called for. Ballots that were Unprocessed were not necessarily improperly evaluated by the voting system.
User Permissions
AuditEngine has a set of user permissions to regulate the activities of users on the site. These can be summarized as follows, from lowest permission to highest:
Role | Permissions Granted |
---|---|
Guest | This is the default role and has no permissions. You will need to request additional permissions for activities on the site. |
Uploader | An uploader has no permissions except to upload files to a given district. The files that are uploaded are placed in an initial bucket until they are reviewed by a "User" or above and adopted to the election. |
Observer | This role can watch specific audits, and review the results. |
User | This is the general purpose level that can do most activities, but can't run expensive stages. |
Auditor | Same as User but can run expensive stages as well. |
Project Manager | Can also manage and edit projects. A project is a set of audit jobs, for example all the audit jobs in an entire state. |
Admin | The administrator has authority also to change the role of users. |
There are also audit-specific permissions for running TargetMapper and AdjudiTally.
Verification Images, Verification Phase
AuditEngine has an optional capability to process ballot images that are made using non-voting system scanners. We call these "Verification Images", and the workflow phase is the "Verification Phase".
Verification images should be of precincts (or batches) that correspond to an aggregated group in the CVR or summary report. These are not run as a separate audit, but as an additional 5th phase in the pipeline of the subject audit. If the Paper-Centric workflow is used, then these can be considered to be of primary concern rather than as an optional add-on.
In all cases, non-voting system scanners are used to capture the images of the paper ballots, sometimes called electronic copies.
The number of batches processed in this way can be minimized, because the primary hazard being tested is the remote chance that the ballot images were manipulated prior to being secured by the system after scanning. Therefore, the number of batches can be reduced to a fixed value.
Typically, this phase can be added if there is any public concern about specific contests that are close. It is mainly important that the public knows this option is available, because any temptation to attempt to hack the election in this manner will be reduced.
The physical ballots utilized by this phase should not be touched by election staff after the precincts (or batches) of ballots are chosen, and the seals should be intact when they are ultimately rescanned. Thus, the precincts (or batches) should be kept isolated and easily accessible in storage. The exact strategy to be used in this phase will depend on the voting system and how ballots are already being stored, and must be discussed with our election experts.
Voter Intent
"Voter intent" refers to the will or preference of a voter as expressed through their vote. It aims to capture the true choice of the voter, even if the ballot is marked ambiguously, contains errors, or does not fully comply with voting instructions. The concept is often used in elections to ensure that a vote reflects the voter's intended selection, particularly in cases where ballots are being recounted or where disputes arise about the validity of a vote.
For example, if a voter marks a ballot incorrectly but their intended candidate is clear (such as circling a candidate's name instead of filling in a bubble), "voter intent" could be used as the guiding principle in determining whether to count the vote. It is frequently discussed in relation to recounts, contested elections, and electoral procedures.
State law varies, and in some cases machine evaluation (how the machine would interpret the mark) is used instead of voter intent.
Voter-Verifiable Paper Audit Trail (VVPAT)
A small printer that can be added to a DRE device to provide a paper record of each vote on a paper tape. The paper tape is enclosed behind a window so the voter can verify that their vote as printed correctly reflects their vote. Unfortunately, this only can verify that the vote is correctly printed, and does not necessarily reflect what is recorded in memory. This type of record is sequential and can be easily linked to the voter, is relatively difficult to audit, and frequently, the cheap printers would fail or overprint. AuditEngine cannot audit these records.
Voting System
A general term to refer to the system used by the district to conduct the election, including the EMS and voting machines, etc. Examples include Election Systems & Software (ES&S), Dominion Voting Systems (Dominion), Hart InterCivic (Hart), and others. AuditEngine strives to be compatible with any voting system that can provide adequate data files in the form of ballot images, CVRs, Ballot Style Masters and other related data. See also Election management system (EMS).
Voting System for All People (VSAP)
A voting system custom designed for Los Angeles County and to date, is only used by that county. It uses a touch-screen interface and produces Ballot Summary Cards with QR Codes that encode the vote of the voter. In this case, the QR Codes can be read by any smartphone and the output compared with notations on the printed ballot summary card.
Workflow
How the work will be done in an audit, particularly when working in concert with state and county operations. The following grades are defined:
- Public Oversight Workflow: This workflow does not require any special actions by election staff other than publishing data files. This is the default mode of operation when AuditEngine is run by civic groups, candidates or campaigns without much interaction with election staff. This workflow is typically run after Certification and is dependent on the availability of ballot images and CVR files, which may be delayed until after certification. Because they are run after certification, audits using this workflow may still impact elections if there are significant findings in local contests, but there is typically not a direct route to changing the outcome. More importantly, they allow the campaigns to have all their questions answered and accept the outcome. This is particularly important for those candidates and their supporters who did not win. This workflow also requires the least amount of data: neither the Logic and Accuracy Test (LAT) ballot images nor the LAT CVR files are required, because the turnaround time is not optimized. However, there are two variants based on whether state law allows full public release of the data, as follows:
- Public Oversight with Full Data Release: Ballot images and CVR files are all available and the election district provides these. We don't believe that any redaction of additional marks is required on these ballots, because largely, we will only need to view a subset of the ballots in our reports, and it is generally not possible to link the voter to their ballot.
- Public Oversight Workflow with Limited NDA: If state law does not embrace full release of ballot images, then we can accept those under a Limited Non-Disclosure Agreement (Limited NDA). We would keep the source data on a secure server and when reports are produced, only provide those ballot images that are at issue.
- Cooperative Workflow: When audits are run in cooperation with election districts, turnaround time of results prior to Certification can be reduced by configuring AuditEngine prior to the start of the election using:
- Ballot Contest Names and Ballot Option names: We need a source of information for the exact text used on hand-marked paper ballots for contest names and option names. This could be from the Logic and Accuracy Test (LAT) ballot images or an export of contest and option names from the voting system.
- CVR Contest Names and CVR Option Names: We need a source of information of the contest names and option names as used in the CVR. This could be the "known good" CVR of LAT ballots or an export of CVR metadata.
- Ballot Style Masters (BSMs): PDF files, which must be in searchable format (not image scans).
This will allow AuditEngine to be fully configured prior to the election and ready to run when live election data is available without any further time consuming configuration.
A Verification Phase is an optional addition to this workflow. It requires handling of the paper ballots and may require additional jurisdiction involvement (see details in the "Jurisdiction-Run Workflow", below).
- Jurisdiction-Run Workflow: To further optimize the workflow and reduce the reliance on personnel outside the control of election districts, the configuration of AuditEngine for audits in each election district can be easily delegated to staff in those districts. This is quite similar to Cooperative Workflow except that the work is being done by staff in each election district rather than by an outside team associated with AuditEngine. In this workflow there must be at least one project manager who is independent from the election districts assigned to each state to provide independent oversight.
In this workflow, staff in each district will:
- Cooperate with workers of all the other districts in the state in the design of ballots. In this way, a common configuration can be used by AuditEngine to recognize the County and Election on each ballot.
- Within the AuditEngine browser app, create the Election and create two audits for that district:
- One audit uses LAT data (the "LAT Audit"), which will allow the full configuration and testing of AuditEngine, for this election (this assumes full ballot images and ballot-level CVR is available), and
- the other is for the "live" election data.
- For the "LAT Audit":
- The Ballot Style Masters (BSMs) and LAT ballot images and LAT CVR files are uploaded.
- Run Phases 1 and 2 with the LAT images to create data for the map.
- Create the Target Map using the TargetMapper App, import the map, and check the redline proofs for accuracy.
- Optionally and preferably run Phases 3 and 4 using the LAT data as an audit to check that mapping. This incurs a nominal cost.
- Then, when live data is available:
- Upload the ballot images and CVRs to AuditEngine.
- Run Phase 1 (Metadata Analysis), but skip Phase 2 (Mapping) and move directly to Phase 3 (Vote Extraction) and 4 (Comparison and reporting). Assuming there are no issues, these phases can be run within a day.
- If a Verification Phase is to be used by the jurisdiction, the precincts (or batches) are selected using rolls of ten-sided dice from a list of precincts (or batches). Weighting the batches by the count of ballots in them may be appropriate if there are some batches with significantly more or fewer ballots than the average. Then the precincts (or batches) are pulled from storage without breaking seals, and are brought to a secure location where the boxes are opened and the ballots are scanned using non-voting system scanners and with public observation. The stages of the Verification Phase include extracting the metadata from the scanned images, performing vote extraction, and then comparing the result on an aggregated basis.
- Paper-Centric Workflow: Some jurisdictions may wish to primarily rescan paper ballots in a machine-assisted audit, and simultaneously conduct a ballot-image audit. For example, if existing regulations specify that there be a review of 3% of the paper ballots and it is allowed to use non-voting system scanners and software to produce an independent result, except for the scanning process itself, AuditEngine can be used in this workflow. The same procedures would be used as in the Cooperative Workflow regarding the use of the LAT ballot images and LAT CVR data to configure AuditEngine, and to cover all ballots, the voting system images can be used.
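The ten-sided dice selection mentioned for the Verification Phase in the Jurisdiction-Run Workflow can be sketched as follows. This is one common way to turn dice rolls into an unweighted selection and is an assumption, not a description of a prescribed procedure: the digits are read in fixed-size groups, each group forms an index into the numbered list of batches, and groups that fall outside the list or repeat an earlier pick are discarded. The function name and batch names are hypothetical, and weighting by ballot count is not shown.

```python
def select_batches(batches, rolls, how_many):
    """Select batches from a list using rolls of ten-sided dice.

    batches  is the ordered list of precincts (or batches), indexed from 0.
    rolls    is the sequence of digits (0-9) rolled, in the order rolled.
    how_many is the number of distinct batches to select.
    """
    if any(d not in range(10) for d in rolls):
        raise ValueError("each roll must be a digit from 0 to 9")
    digits = len(str(len(batches) - 1))      # dice needed per selection
    selected = []
    for i in range(0, len(rolls) - digits + 1, digits):
        index = int("".join(str(d) for d in rolls[i:i + digits]))
        # Discard out-of-range numbers and repeats, then roll again.
        if index < len(batches) and batches[index] not in selected:
            selected.append(batches[index])
        if len(selected) == how_many:
            break
    return selected


batches = [f"batch-{i:02d}" for i in range(25)]   # indexes 00 to 24
rolls = [0, 7, 9, 3, 1, 2, 0, 7, 2, 4]            # 07, 93, 12, 07, 24
print(select_batches(batches, rolls, how_many=3))
# ['batch-07', 'batch-12', 'batch-24']
```

Because the rolls are passed in as data, the same selection can be reproduced by any observer who recorded the dice.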
Write-in
If the write-in target is marked or if the Write-in Area is filled in, then it is considered that the voter intended to write in the name of a write-in candidate. How this is handled is specific to different states, but most often, the written-in name must also be from a list of qualified write-in candidates.
Write-ins on BMD ballots do not need human-eye adjudication because the names are keyed in.
The name written-in should be examined in close contests even if there were no candidates that were qualified as write-ins for this contest, because the name written-in may be of a listed candidate, and if so, then the vote is accepted for that listed candidate. In cases when an overvote is initially detected and the written-in name is one of the listed candidates, it should be a single vote for that candidate. How this is handled is very state-specific.
Write-ins normally account for about 80% of Contest Variants. Since write-ins are very commonly misunderstood by voters, their use should be minimized.
Write-in Area
The write-in area is the area set aside for manual entry on Hand-Marked Paper Ballots. We suggest that there is a square box provided for user entry that does not have any other marks and will be blank if the user does not write anything in the box.
AuditEngine checks to see if there is anything written into the write-in area and if there is, attempts to convert it using Optical Character Recognition (OCR). The Write-in Area can be adjusted to match the area on the ballot relative to the Target.