Introduction
Database management is an important aspect of research studies that requires a clear understanding of the problems under investigation as well as the formal data processing that will be needed by the study. Success at achieving this combination is rare. To address this deficit, the ISAP Data Management Center (DMC) was designed to allow principal investigators (PIs) to focus entirely on the management of protocols and the design of the survey instruments, while the DMC handles programming to facilitate the data entry process and ensure that the data are valid. In addition, the DMC provides the PI with ongoing tracking of the study’s progress, which is perhaps the DMC’s most valuable role.
Web-based Data Entry
Advances in Web-based computer technologies offer new promise for researchers, including enhanced approaches to data collection and multidisciplinary data management. In comparison to more traditional approaches to research data management, Web-based technologies allow simultaneous data entry and extraction by all investigators regardless of geographic location. In addition, changes in data capture can be made in a timely fashion in order to maximize the quality of data being collected. Site requirements are minimal for basic data entry—just reliable Internet access and a Web browser. Client-side prompts and validation checks assist in the data entry process, including both within-form and cross-form checks.
Web-based data collection does have its disadvantages. First, this type of data collection requires that remote computers have a reliable connection to the Internet and so there is always the threat of downtime. Second, some of the methods often used to transmit and validate data such as XML or JavaScript can be lengthy and add to the amount of information that has to be passed between the client and the server. Finally, Web forms often are created as static (or pre-defined) forms that contain one long series of questions that require constant scrolling, which may interrupt the responder’s thought process. If the questionnaire has to be provided in multiple languages, creating the exact Web form in each of the different languages can be challenging and time intensive.
DMC Solutions
While ISAP’s Data Management Center cannot fully eliminate the first two disadvantages, it can alleviate much of the third. The DMC offers a large library of forms (811 as of April 2007) that can be converted into static Web-based entry forms. New forms can be added to the library by using Cardiff’s Teleform as the form generator. Scrolling can be minimized by programming the form so that the client “jumps” from one question to the next appropriate question (a skip pattern). While implementing changes to Web-based forms can be labor intensive, once programmed, such changes take effect in all new interviews. By contrast, changes to traditional paper forms take time to reach staff, because the updated instrument(s) must be disseminated to everyone.
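The skip-pattern idea can be sketched in a few lines of Python. This is an illustration only, not the DMC's implementation: the question identifiers (Q1 to Q4), the question text, and the helper names next_question and path_through_form are all hypothetical.

```python
# Hypothetical questionnaire: each question names the question that follows
# it, optionally depending on the answer just given (the skip pattern).
QUESTIONS = {
    "Q1": {"text": "Have you ever smoked?", "next": {"yes": "Q2", "no": "Q4"}},
    "Q2": {"text": "How many cigarettes per day?", "next": "Q3"},
    "Q3": {"text": "At what age did you start?", "next": "Q4"},
    "Q4": {"text": "What is your date of birth?", "next": None},
}


def next_question(current, answer):
    """Return the id of the next question to show, or None at the end."""
    rule = QUESTIONS[current]["next"]
    if isinstance(rule, dict):
        # Answer-dependent branch: jump straight to the appropriate question.
        return rule[answer]
    return rule


def path_through_form(answers):
    """List the questions actually shown for a given set of answers."""
    shown, current = [], "Q1"
    while current is not None:
        shown.append(current)
        current = next_question(current, answers.get(current))
    return shown
```

A respondent who answers "no" to Q1 is taken directly to Q4 and never scrolls past Q2 and Q3.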
To compare all data capture systems managed by the DMC, please see this Web page: http://www.uclaisap.org/dmc/html/capture-system.html
Features of Web-Based Data Collection
Key field data, such as header information, can be pre-populated, and custom error messages can be programmed to appear either when a given question receives a response or when the entire form is submitted. These error messages can be formatted either as warnings (which allow the record to be submitted) or as stops (preventing the record from being submitted). Forms can be printed directly from the Internet browser, providing a back-up copy of the responses.
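The difference between warnings and stops can be illustrated with a minimal Python sketch. The field names (participant_id, age), the 18-99 range, and the message wording are assumptions made for the example; they are not taken from any DMC form.

```python
from dataclasses import dataclass


@dataclass
class Message:
    field: str
    severity: str  # "warning" lets the record through; "stop" blocks it
    text: str


def check_record(record):
    """Run hypothetical checks on one submitted form and collect messages."""
    messages = []
    age = record.get("age")
    if age is None:
        messages.append(Message("age", "stop", "Age is required."))
    elif not 18 <= age <= 99:
        messages.append(
            Message("age", "warning", "Age is outside 18-99; please confirm.")
        )
    if not record.get("participant_id"):
        messages.append(
            Message("participant_id", "stop", "Participant ID is required.")
        )
    return messages


def can_submit(messages):
    """A record may be submitted unless at least one message is a stop."""
    return all(m.severity != "stop" for m in messages)
```

The same check function could be run when a single question is answered or when the whole form is submitted; only the stop-level messages prevent submission.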
Studies often choose to create a lead-in Web page, which controls access to the various forms expected to be completed during a given study visit. As forms are completed, they can be “locked,” preventing duplicate entry. Once data are submitted, a unique serial number is generated and provided to the user. This serial number allows the supervisor to view and modify the data as needed. Additionally, considering that some surveys are completed during two or more sessions, this serial number allows data to be added to a partially completed interview (with supervisor-level access).
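A rough Python sketch of form locking and serial numbers follows. The class name VisitForms, the form names, and the use of a random UUID as the serial number are assumptions for illustration; the text does not say how the DMC generates its serial numbers.

```python
import uuid


class VisitForms:
    """Tracks which forms of one study visit have been submitted."""

    def __init__(self, expected_forms):
        self.expected = set(expected_forms)
        self.serials = {}  # form name -> serial number
        self.records = {}  # serial number -> submitted data

    def submit(self, form_name, data):
        """Store the data, lock the form, and return a unique serial number."""
        if form_name not in self.expected:
            raise KeyError(f"{form_name} is not part of this visit")
        if form_name in self.serials:
            # The form is locked: a second entry would duplicate the record.
            raise PermissionError(f"{form_name} is locked")
        serial = uuid.uuid4().hex  # assumption: any unique identifier would do
        self.serials[form_name] = serial
        self.records[serial] = dict(data)
        return serial

    def is_locked(self, form_name):
        return form_name in self.serials
```

The returned serial number is the key a supervisor would later use to look up the stored record.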
Security of the Web forms is a primary concern. Users cannot grant themselves data entry rights; instead, the project investigator specifies which users can enter data and which can modify existing data. These permissions can be revoked at any time.
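One way to picture this rule is a small permission table that only the project investigator may change. The Python sketch below is hypothetical: the class ProjectAccess, the right names "enter" and "modify", and the user names are invented for the example and do not describe the DMC's actual security layer.

```python
class ProjectAccess:
    """Per-project rights that only the project investigator can change."""

    RIGHTS = {"enter", "modify"}

    def __init__(self, investigator):
        self.investigator = investigator
        self.grants = {}  # user -> set of rights

    def _require_investigator(self, acting_user):
        if acting_user != self.investigator:
            raise PermissionError("only the project investigator can change rights")

    def grant(self, acting_user, user, right):
        self._require_investigator(acting_user)
        if right not in self.RIGHTS:
            raise ValueError(f"unknown right: {right}")
        self.grants.setdefault(user, set()).add(right)

    def revoke(self, acting_user, user, right):
        self._require_investigator(acting_user)
        self.grants.get(user, set()).discard(right)

    def allowed(self, user, right):
        return right in self.grants.get(user, set())
```

Because every change passes through the investigator check, a data entry user cannot add rights to their own account.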
Lastly, reports are created according to the needs of the study. These Web-based reports reflect the up-to-the-minute status of the study’s data. Such reports often summarize a patient’s progress through the study and show detailed information on the number of screened participants and the number of follow-up interviews completed. Additionally, the reports can provide global views of participant demographics, participant intake characteristics, or treatment provided. Reports provide the PI with a method to monitor certain critical measures, which can be invaluable in assuring that the data collection is proceeding as expected and that it will fulfill the study goals.
Conclusion
Data collection across the Internet provides a secure and reliable method to process study data and allows for remote data input from as many sites as needed in as many locations as needed. Resulting data are housed in the DMC’s SQL server and will be given back to the PI (or designate) in any required format, according to their schedule demands.
While no perfect data collection system exists, data entry across the Internet is a fast, reliable, and secure method to capture data. It also reduces paperwork (with associated costs) required by more traditional forms of data entry and is perfectly suited for large, multi-site trials as well as the one-site, ad-hoc research project.
FileMaker Pro allows you to easily build customized data entry applications to suit your needs. The data collection forms can be developed for single users or multiple users in a networked environment using the FileMaker Pro server. Whenever the FileMaker Pro server is used, the data transmission between the remote workstations and the server can be encrypted using SSL technologies. All data on the FileMaker Pro server will be protected both by user name and password, with permissions assigned to either users or groups as needed.
Once entered, the data can be exported into the needed format (SAS, SPSS, etc.).
The QDS™ Questionnaire Development System is a complete system for developing and administering data collection applications. QDS™ enables you to produce all materials needed to design and administer questionnaires. Form elements (either whole sections or specific questions) can be easily reused across new forms.
The QDS™ system consists of several modules, discussed below. A separate license fee is required for each module.
Design Studio
The first step in creating a QDS™ application is to define questionnaire
specifications in the Design Studio. Your specifications include all content
and settings for your questionnaire—e.g., question text, instructions,
branching instructions, consistency checks, etc.
Data Entry Module
The QDS™ Data Entry module is used to key data originally collected
on a paper form. The Data Entry module supports double entry (also
called key verification or double keying) for increased accuracy.
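The principle behind double entry can be shown with a short Python sketch: the same paper form is keyed twice, and any field where the two passes disagree is flagged for review. This illustrates the general technique only, not the internals of the QDS™ Data Entry module; the function name compare_entries and the field names are hypothetical.

```python
def compare_entries(first_pass, second_pass):
    """Return the fields whose two keyed values disagree.

    Both arguments map field names to the value keyed by one operator.
    A field missing from one pass is reported with None for that pass.
    """
    discrepancies = {}
    for field in sorted(set(first_pass) | set(second_pass)):
        a, b = first_pass.get(field), second_pass.get(field)
        if a != b:
            discrepancies[field] = (a, b)
    return discrepancies
```

For example, if one operator keys an age of "34" and the other keys "43", only the age field is returned, and someone checks it against the paper form.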
CAPI (Computer-Assisted Personal Interview)
The QDS™ CAPI module allows interviewers to conduct face-to-face interviews
using the computer. The computer displays one question at a time and
allows the interviewer to enter responses in real time using the keyboard,
mouse, and/or touch screen. Entering the data into the computer at the time
of data collection eliminates the need for a separate data entry step. The
CAPI module allows you to include information/probes for the interviewer.
ACASI (Audio Computer-Assisted Self-Interview)
The QDS™ ACASI module allows the respondent to be "interviewed" by
computer. The computer displays one question at a time and allows the
respondent to enter responses using the keyboard, mouse, and/or touch screen.
You may choose to have the computer read the questions and responses to the
respondent using a computer text-to-speech engine or recorded WAV files.
Entering the data into the computer at the time of data collection eliminates
the need for a separate data entry step. Because no interviewer is required
to conduct an ACASI interview, use of ACASI allows a single staff member
to supervise multiple interviews. ACASI is also useful for providing a private
setting for sensitive information.
Warehouse Manager
Once data are collected/recorded using one or more of the Data Collection
modules, you may import the data into the Warehouse Manager for data
management and export data for analysis.
Teleform is an Optical Character Recognition (OCR) program used to collect large amounts of data both accurately and quickly. The program was developed as a natural progression from the older Scantron technology. Data are recorded on a form, and the form is scanned or faxed into the system. After review by DMC staff, the data are transferred into the study’s database.
New forms are designed within a form designer according to the needs of the protocol. Data points can include bubbles, numeric values, free text, signatures, and bar codes. While legibility will always be a factor, properly completed bubbles need no manual verification, so review is limited to the hand-printed responses, eliminating the need to review or key-enter much of a form. The data are validated using a custom-written script, which checks for valid ranges, skip patterns and required entries, and cross-field or cross-form consistency.
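The three kinds of checks named above (range, skip pattern or required entry, and cross-field consistency) can be sketched in Python as follows. This is not the DMC's validation script; the field names (ever_used, days_used, birth_date, interview_date) and the 0-30 range are assumptions chosen for the example.

```python
from datetime import date


def validate(record):
    """Return a list of problems found in one scanned-form record."""
    problems = []

    # Range check: hypothetical "days used in the past 30 days" item.
    days = record.get("days_used")
    if days is not None and not 0 <= days <= 30:
        problems.append("days_used out of range 0-30")

    # Skip pattern / entry required: the follow-up item is required only
    # when the gate question was answered "yes", and must be blank otherwise.
    if record.get("ever_used") == "yes" and days is None:
        problems.append("days_used required when ever_used is yes")
    if record.get("ever_used") == "no" and days is not None:
        problems.append("days_used must be blank when ever_used is no")

    # Cross-field consistency: an interview cannot precede the birth date.
    born, seen = record.get("birth_date"), record.get("interview_date")
    if born and seen and seen < born:
        problems.append("interview_date precedes birth_date")

    return problems


# Hypothetical record with one cross-field error.
EXAMPLE = {
    "ever_used": "yes",
    "days_used": 12,
    "birth_date": date(1980, 1, 1),
    "interview_date": date(1979, 6, 1),
}
```

Cross-form checks would follow the same pattern, with the function receiving records from two forms instead of one.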
MS Access is the database program included in the Microsoft Office suite. As with any database program, tables and user forms (GUIs) constitute the basic elements, but Access has three additional powerful functions: Queries, Reports, and VBA code, which are described below.
Queries
Queries allow the user to perform a partial analysis on the data without the need of SAS or SPSS. While not as powerful as these statistical programs, queries can be designed to identify out-of-range data, summarize data, or perform updates to the data. To properly analyze a full database, variables from multiple tables may need to be linked together; queries provide a method to easily link data from multiple tables.
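The two uses mentioned above, finding out-of-range values and linking tables, can be illustrated with ordinary SQL. The sketch below runs the queries through Python's built-in sqlite3 module as a stand-in for Access; the table names, column names, and sample rows are hypothetical, and Access's own SQL dialect differs in some details.

```python
import sqlite3

# In-memory database with two small hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE participants (pid TEXT PRIMARY KEY, site TEXT);
    CREATE TABLE intake (pid TEXT, age INTEGER);
    INSERT INTO participants VALUES ('P001', 'A'), ('P002', 'A'), ('P003', 'B');
    INSERT INTO intake VALUES ('P001', 34), ('P002', 134), ('P003', 27);
    """
)

# Out-of-range check: list intake records with an implausible age.
out_of_range = conn.execute(
    "SELECT pid, age FROM intake WHERE age NOT BETWEEN 18 AND 99 ORDER BY pid"
).fetchall()

# Linking tables and summarizing: number of intake records per site.
per_site = conn.execute(
    """
    SELECT p.site, COUNT(*) AS n
    FROM participants AS p
    JOIN intake AS i ON i.pid = p.pid
    GROUP BY p.site
    ORDER BY p.site
    """
).fetchall()
```

The first query returns the single record with age 134; the second joins the two tables on the participant identifier and counts records by site.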
Reports
Reports provide the user with pre-formatted/printable summaries of the data or the participants. Reports are graphically designed and easily customized. Often, queries are first needed (to aggregate the data) before the report can be produced.
VBA code
VBA (Visual Basic for Applications) code allows all of the higher-end functions that make using a database user friendly. These functions include automation of the form and the ability to “walk through” an entire table (to compare values between different rows). In addition, VBA can provide the user with complex pop-up messages about the data being entered, make “skip pattern” questions appear or vanish, and even perform such tasks as creating email messages or moving, copying, and renaming files on a computer’s hard drive. Basically, any task that is repeatedly performed on a computer can benefit from the use of VBA code.
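The idea of walking through a table to compare values between rows can be sketched as follows. The example is written in Python rather than VBA, purely to show the logic; the function name out_of_order_visits and the sample visit data are hypothetical.

```python
from datetime import date


def out_of_order_visits(rows):
    """Walk the rows in order and flag visits dated before the previous one.

    `rows` is a list of (participant_id, visit_number, visit_date) tuples
    sorted by participant and then by visit number.
    """
    flagged = []
    previous = None
    for row in rows:
        same_participant = previous is not None and row[0] == previous[0]
        if same_participant and row[2] < previous[2]:
            flagged.append(row)
        previous = row  # remember this row for the next comparison
    return flagged


# Hypothetical visit table: visit 3 for P001 is dated before visit 2.
ROWS = [
    ("P001", 1, date(2007, 1, 10)),
    ("P001", 2, date(2007, 2, 14)),
    ("P001", 3, date(2007, 2, 1)),
    ("P002", 1, date(2007, 1, 5)),
]
```

In Access, the same loop would typically step through a recordset one row at a time, keeping the previous row's values in variables.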
The DMC currently supports three projects by entering project data collected on paper forms into the Web sites developed by the funding agencies. Once collected, data forms are either hand delivered or faxed to the DMC, where DMC staff enter the data into the project Web site. Any resulting queries are resolved through direct contact between the DMC and the clinic.