Data management
Importing/exporting data
- Import and export data from Excel .xls and .xlsx files
- Import and export CSV and delimited data
- Copy/paste data from spreadsheets
- Input data in spreadsheet editor
- Read from and write to SQL sources with ODBC (see below)
- Import and export fixed-format data using a dictionary
- Import and export any type of text data
- Unicode (UTF-8) support, including conversion from/to extended ASCII
- Import EBCDIC data and convert EBCDIC to ASCII
- Import and export data in the format required by the FDA for NDA submittals
- Import and export SAS Transport XPORT files
- Import Federal Reserve Economic Data
- Import from Haver Analytics databases
- Import data from Wharton Research Data Services (WRDS) via ODBC
- Import and export dBase files
- High-level import/export of full Excel worksheets
- Low-level cell-by-cell access to write results to and read data from Excel, including graphs, formulas, date formats, currency formats, bold, italics, and more
ODBC support
- Import data from any ODBC data source, such as Oracle, SQL Server, Access, Excel, MySQL, and DB2
- Export data to new or existing ODBC tables
- Execute custom SQL commands individually or in batches
- Customize ODBC connection strings
- Support for ODBC
- Support for VARCHARs/CLOBs and BLOBs
- Support for Unicode
Built-in spreadsheet editor
- Clipboard Preview Tool lets you control how data will be pasted
- Manage variables with the Variables Tool
- For Windows, Mac, and Unix
Properties window
- Manage variables
- Manage dataset properties
- For Windows, Mac, and Unix
Variables Manager
- Change storage types, names, and formats
- Add and edit value labels
- Attach notes to variables
- Filter variables
- For Windows, Mac, and Unix
Functions
- Statistical functions
- Mathematical functions
- Trigonometric functions
- String functions
- Unicode functions
- Regular expressions
- Date and time functions
- Time-series functions
- Random-number functions
  - 18 functions
  - Stream random numbers
- Matrix functions
- Programming functions
Data reorganization
- Row–column transposition
- Data reshaping
- Stacking of variables
- Collapsing into means, totals, etc.
Unicode support
- UTF-8
- Translation of extended ASCII to UTF-8
- Unicode-aware string functions
- Locale-based sorting and string comparison
Labels
- Dataset labels
- Variable labels
- Value labels (e.g., male and female for 0 and 1)
- Ability to switch between multiple sets of data, variable, and value labels
- Missing-value labels
- Support for multiple languages, including Unicode support
Notes
- Extensive notes can be attached to a dataset
Data snapshots
- Allow multiple levels of undo for modified datasets
Automatic memory management
- Up to 1.5 TB of RAM supported
- Up to 120,000 variables in Stata/MP; up to 32,767 variables in Stata/SE
- 20 billion or more observations in Stata/MP
- Up to 2.1 billion observations (Stata/SE and Stata/IC)
Sorting
- Ascending or descending sorts
- Multiple-key sorts
- Numeric and string sorts
- Locale-aware Unicode string sorting and comparison
Combining datasets
- Merge datasets
  - By key variables
  - By observations
- Join datasets
  - Outer join
- Append datasets
- Append time series
Special datasets
- Longitudinal data/panel data
- Survival/duration data
- Time series
- Survey data
- Multiple imputations
Dynamic document generation
- Create HTML files with embedded Stata code, output, and graphs
- Markdown
Creation of Word, Excel, and PDF files
- High-level creation of Word documents containing Stata results and graphs
- Low-level programmatic access for fine-control creation of Word documents
- High-level creation of PDF files containing Stata results and graphs
- Low-level programmatic access for fine-control creation of PDF files
- High-level import/export of full Excel worksheets
- Low-level cell-by-cell access to write results to and read data from Excel, including graphs, formulas, date formats, currency formats, bold, italics, and more
Image output
- Save graphs as PDFs
- Save graphs to EPS or TIFF files for publication
- Save graphs to PNG or SVG files for the web
Utilities
- Count the number of observations that satisfy specified conditions
- Formatted and unformatted disk I/O
- Zip-file support
- Unicode conversion from/to extended ASCII
- Custom filters to manipulate text files
Variable management
- Generation of new variables
- Replacement of existing variables
- Renaming variables
- Encoding and decoding string variables
- Reordering variables in dataset
Dataset utilities
- Flexible description of variables, labels, and types
- List values of variables
- Data signatures to verify the integrity of datasets
- Codebooks for variables
- Value-label reports
- Tables of duplicates and missing values
- Compress (make dataset as small as possible without loss of accuracy)
Variable types
- Numeric storage types
  - Byte
  - Integer (int)
  - Long
  - Float
  - Double
- String (including Unicode, very long strings, and BLOBs)
- Dates and times
- Business calendars
Long string support
- Strings up to 2 billion characters long
- Coalescing of duplicate values to save memory
- Binary 'strings' (BLOBs)
- Import and export entire files into long strings/BLOBs
- Unicode (UTF-8) strings
Stored results
- Save results to disk for later use
- Store estimation results in memory
- Create tables to compare results