Frequently Asked Questions


What is BigMouth?

BigMouth is a centralized oral health data repository derived from electronic health records (EHRs) at multiple dental schools participating in the Consortium of Oral Health Research and Informatics (COHRI). With demographic, medical history, and dental data on over 3 million patients, this oral health database is available for research and the advancement of evidence-based dentistry.


Who are the participating institutions?

As of December 2020, the following institutions have contributed data to BigMouth: Harvard School of Dental Medicine, University of California San Francisco School of Dentistry, University of Michigan School of Dentistry, UTHealth School of Dentistry at Houston, Tufts University School of Dental Medicine, University of Colorado School of Dental Medicine, Loma Linda University School of Dentistry, University of Buffalo School of Dental Medicine, The University of Iowa College of Dentistry, The University of Minnesota School of Dentistry, and University of Pittsburgh School of Dental Medicine. We expect more institutions to participate in the coming years.



What data do I have access to?


Upon login, a user is presented with two folders:

a.       axiUm - Contains data from the user's own local axiUm instance.

b.       COHRI - Contains data from all the participating sites. This folder is normally used when a user wants to query data from all COHRI members. The folder is designed so that the user can get a summary across all sites while the source of each piece of data remains anonymous.


What does the axiUm folder contain?


The axiUm folder contains the terminologies specific to the user's site. Users can query for data available in axiUm at their own site. The number displayed next to each folder or leaf node is the number of patients.


•   Demographics: contains age, gender, and race/ethnicity. Users can query age by value with the following operators: <, >, ≥, ≤, =, between.

•   Diagnosis: contains Dental Diagnostic System (DDS) terminologies, formerly known as the EZCodes diagnostic terminology. Users can query a diagnosis with or without a date range.

•   Forms: contains patients' medical and dental history and caries risk assessment. In addition, each site may have other site-specific forms. Users can query a form question with or without a date range.

•   Insurance: contains the names of insurance companies that insure patients.

•   Odontogram: contains existing materials, existing conditions on a specified tooth or surface, and the total number of missing teeth.

•   Perio: contains clinical periodontal parameters, which can be queried by examination type (e.g., initial examination, re-evaluation examination).

•   Medication: contains prescription information and patients' current medication information, which can be queried with or without a date range.

•   Practice: contains details about the different practices at a site.

•   Procedures: contains Current Dental Terminology (CDT) and Current Procedural Terminology (CPT) procedures, which can be queried with or without a date range.

•   Providers: contains the different types of dental health providers, such as dental student, dentist, and resident.



How is the age calculated?


Age is calculated as of the date of the patient's last recorded observation (procedure, diagnosis, etc.).
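
As a rough illustration, the calculation could be sketched as follows. This assumes that age is counted in completed years between the date of birth and the last recorded observation; the function name and the sample dates are hypothetical and are not taken from BigMouth.

```python
from datetime import date

def age_at_last_observation(birth_date: date, observation_dates: list[date]) -> int:
    """Completed years between birth and the most recent observation (assumed definition)."""
    last_seen = max(observation_dates)
    years = last_seen.year - birth_date.year
    # Subtract one year if the birthday has not yet occurred in the last-observation year.
    if (last_seen.month, last_seen.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years

# Hypothetical patient: born 1980-06-15, last observation on 2020-03-01.
print(age_at_last_observation(date(1980, 6, 15), [date(2015, 1, 10), date(2020, 3, 1)]))  # 39
```

Under this assumption, a patient's age does not advance after their last recorded visit, which is why it can differ from the patient's age today.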



What does the COHRI folder contain?


This folder provides an integrated terminology system that allows users to query for data from all schools that contribute data to BigMouth.


•   Demographics: contains age, gender, and race/ethnicity from all schools.

•   Diagnosis: contains DDS terminologies from Harvard, UCSF, and UTH.

•   Forms: contains medical and dental history from all schools. COHRI terminology contains caries risk assessment from all schools except UPITT.

•   Odontogram: contains existing materials, existing conditions on a specified tooth or surface, and the total number of missing teeth from all schools.

•   Perio: contains clinical periodontal parameters from all schools.

•   Procedures: contains CDT and CPT procedures from all schools.

•   Providers: contains different dental health providers from all schools.



Where is the race/ethnicity standard derived from?


The race/ethnicity standard is derived from NIH guidelines.



How can I get data for my research from BigMouth?


The BigMouth interface provides query tools that let users obtain the number of patients with conditions or diseases of interest, such as the number of patients with caries or periodontal disease.

There are two levels of access:

Level 1: Users who have access to BigMouth can run queries themselves and use the count for their research.

Level 2: If the user wants detailed data for each individual patient, a copy of the IRB approval document must be provided along with the data request template. The template helps us extract the requested data, which will be delivered in an Excel sheet with the requested variables in columns and individual patients in rows.

In addition, users are required to sign a data access request form. If the user wants to access patient data from outside the school with which the user is affiliated, the project must be approved by the COHRI project review committee, in addition to the IRB approval from the user's school. The following documents contain all the required information:

a.       Clinical Research Committee Checklist

b.      COHRI Clinical Research Committee Proposal Guidelines

Please contact the BigMouth team for details of the form.


How can I make a 2x2 table using BigMouth? For example, I am interested in a diagnosis A and a treatment B. I would like to see how many patients with diagnosis A had treatment B and how many did not, and how many patients without diagnosis A had treatment B and how many did not.


This can be achieved using the exclude/include option in each column, with one query for each of the four combinations:

Yes Diagnosis, Yes treatment

No Diagnosis, Yes treatment

Yes Diagnosis, No treatment

No Diagnosis, No treatment
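
The four counts correspond to set operations on patient identifiers. The sketch below uses made-up patient IDs and plain Python sets to show the logic; the names are hypothetical, and this is not how BigMouth computes counts internally.

```python
# Hypothetical patient ID sets; in BigMouth the counts would come from four separate queries.
all_patients = {1, 2, 3, 4, 5, 6, 7, 8}
diagnosis_a = {1, 2, 3, 4}
treatment_b = {3, 4, 5}

def two_by_two(population, diagnosed, treated):
    """Return the patient count for each cell of a diagnosis-by-treatment 2x2 table."""
    return {
        ("yes diagnosis", "yes treatment"): len(diagnosed & treated),
        ("no diagnosis", "yes treatment"): len((population - diagnosed) & treated),
        ("yes diagnosis", "no treatment"): len(diagnosed - treated),
        ("no diagnosis", "no treatment"): len(population - diagnosed - treated),
    }

table = two_by_two(all_patients, diagnosis_a, treatment_b)
for cell, count in table.items():
    print(cell, count)
```

A quick sanity check is that the four cells add up to the size of the whole population.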



How can I make a temporal query?


There are five basic steps in defining a temporal query in the Query Tool view:

  1. Change Temporal Constraint to "Define sequence of Events".
  2. Define the population in which events occur (optional step).
  3. Define the events.
  4. Define the order of events (temporal relationships).
  5. Run the query.

Step 1: Change Temporal Constraint

The first step is to change the Temporal Constraint to "Define sequence of Events".

[Screenshot: 1700_WC_tempQry-1.png]

Step 2: Define Population in which events occur

Once you have changed the Temporal Constraint to "Define sequence of Events", a new Page selection box appears below the Temporal Constraint section. The default page is "Population in which events occur". On this page you define your population requirements.

[Screenshot: 1700_WC_tempQry-2.png]


Step 3: Define Events

The events are the first component of a temporal query. There is no upper limit on the number of events you can define, but you must define at least two.

To define the events, click on the Page selection box and select Event 1 from the drop-down list.

[Screenshot: 1700_WC_tempQry-3.png]

The groups and constraints for the events work the same way as in a traditional i2b2 query. Simply drag the items you want to include in Event 1 to the appropriate groups.

Once you have added your items to the groups, click on the Page selection box and select Event 2 from the drop-down list. This changes the page to display the groups for Event 2.

If you need to add a third event, click on the New Event button located next to the Page selection box.

Step 4: Define Order of Events (Temporal Relationship)

The second component of a temporal query is the relationship between the events (the temporal relationship). In the i2b2 Web Client this is defined on the "Define order of events" page, which is accessed by clicking on the Page selection box and selecting Define order of events from the drop-down list.

The page will display as follows:

[Screenshot: 1700_WC_tempQry-4.png]

Step 5: Run Query

In the Web Client, running a query works the same way whether it is a temporal query or a traditional i2b2 query. When you click the Run Query button, the i2b2 client sends the request to the i2b2 server, which runs the query as defined.
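
Conceptually, a two-event temporal query selects patients who have an occurrence of Event 1 dated before an occurrence of Event 2. The following sketch illustrates that idea on made-up records. The record layout, the concept names, and the strict "before" rule are assumptions for illustration and do not reproduce the i2b2 server's implementation.

```python
from datetime import date

# Hypothetical observations: (patient_id, concept, date).
observations = [
    (1, "diagnosis_A", date(2019, 1, 5)),
    (1, "procedure_B", date(2019, 2, 1)),
    (2, "procedure_B", date(2018, 7, 9)),
    (2, "diagnosis_A", date(2019, 3, 3)),
    (3, "diagnosis_A", date(2020, 4, 4)),
]

def patients_with_sequence(records, first_concept, second_concept):
    """Patients with some first_concept observation strictly before some second_concept one."""
    first_dates, second_dates = {}, {}
    for patient_id, concept, when in records:
        if concept == first_concept:
            first_dates.setdefault(patient_id, []).append(when)
        elif concept == second_concept:
            second_dates.setdefault(patient_id, []).append(when)
    # If the earliest first event precedes the latest second event,
    # at least one correctly ordered pair of events exists.
    return {
        pid
        for pid in first_dates.keys() & second_dates.keys()
        if min(first_dates[pid]) < max(second_dates[pid])
    }

print(patients_with_sequence(observations, "diagnosis_A", "procedure_B"))  # {1}
```

Patient 2 has both events but in the opposite order, and patient 3 has only the first event, so neither is returned.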


How do I query for a specific date range?


Users can specify the date range of the observation.
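
In effect, a date constraint keeps only observations whose date falls within the chosen range. A minimal sketch of that filter follows, using hypothetical records and assuming that both endpoints are inclusive:

```python
from datetime import date

# Hypothetical observations: (patient_id, observation_date).
observations = [
    (1, date(2018, 12, 31)),
    (2, date(2019, 1, 1)),
    (3, date(2019, 6, 15)),
    (4, date(2020, 1, 1)),
]

def patients_in_range(records, start, end):
    """Patients with at least one observation dated from start to end, inclusive (assumed)."""
    return {patient_id for patient_id, when in records if start <= when <= end}

print(patients_in_range(observations, date(2019, 1, 1), date(2019, 12, 31)))  # {2, 3}
```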




Why do the patient counts not add up in the navigation tree?


The counts do not add up because one patient may have more than one specific condition. For example, the total number of patients in the category 'Abnormalities of teeth' in the picture below is 984, while the numbers of patients in its subcategories add up to 1041. This is because one or more patients may be in more than one subcategory, such as both 'Cementum Defect' and 'Dentin Defect'.
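
The effect is easy to reproduce with sets. In the sketch below the patient IDs are invented and far fewer than the real counts; the point is only that summing subcategory counts double-counts any patient who appears in more than one subcategory.

```python
# Hypothetical patient IDs per subcategory of 'Abnormalities of teeth'.
subcategories = {
    "Cementum Defect": {101, 102, 103},
    "Dentin Defect": {103, 104},
}

# Sum of subcategory counts: patient 103 is counted twice.
sum_of_counts = sum(len(patients) for patients in subcategories.values())

# Category count: each distinct patient is counted once.
category_count = len(set().union(*subcategories.values()))

print(sum_of_counts, category_count)  # 5 4
```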