SAS Programming Tutorial
Introduction
SAS (Statistical Analysis System) is one of the most powerful programming languages for data analytics, statistical modeling, and business intelligence. It is widely used in industries such as healthcare, finance, retail, and government for data-driven decision-making.
In this tutorial, we will cover everything from the basics of SAS programming to advanced topics like PROC SQL, SAS Macros, and statistical procedures. Whether you’re a beginner or an experienced analyst, this guide will help you understand and master SAS programming.
What is SAS?
SAS is a software suite used for advanced analytics, business intelligence, and data management. It allows users to manipulate, analyze, and visualize data efficiently.
SAS Environment
SAS has different interfaces, including:
- SAS Studio: Web-based interface.
- SAS Enterprise Guide: Point-and-click interface for data analysis.
- Base SAS: The core programming environment for writing and running SAS code.
Installing and Setting Up SAS
- Download SAS from the official website or use SAS OnDemand for Academics (free for students and learners).
- Install and set up your environment.
- Navigate through the SAS interface.
SAS Programming Basics
SAS programs consist of DATA steps and PROC steps. The DATA step is used for data manipulation, while the PROC step is used for analysis and reporting.
1 Writing Your First SAS Program
DATA sample;
input Name $ Age;
DATALINES;
John 25
Lisa 30
Mike 28
;
RUN;
PROC PRINT DATA=sample;
RUN;
2 SAS Libraries and Datasets
- SAS datasets are stored in libraries.
- WORK is the default temporary library.
- Creating a permanent library:
LIBNAME mylib 'C:\SASData';
3 Importing and Exporting Data
- Importing an Excel file:
PROC IMPORT DATAFILE='C:\data.xlsx'
OUT=work.mydata
DBMS=xlsx
REPLACE;
RUN;
- Exporting data to CSV:
PROC EXPORT DATA=work.mydata
OUTFILE='C:\output.csv'
DBMS=csv
REPLACE;
RUN;
3.1 Creating Datasets
DATA newdata;
SET sample;
AgeGroup = IFC(Age >= 30, 'Adult', 'Young'); /* IFC returns a character value; IFN returns numeric values only */
RUN;
2 Sorting and Filtering Data
- Sorting datasets:
PROC SORT DATA=sample;
BY Age;
RUN;
- Filtering data:
DATA filtered;
SET sample;
WHERE Age >= 30;
RUN;
3 Merging Datasets
- Combining multiple datasets using the MERGE statement:
DATA merged;
MERGE data1 data2;
BY ID;
RUN;
4. Functions in SAS
SAS provides a wide range of functions to perform operations on data. These functions are categorized into different types based on their usage.
1 Character Functions
- UPCASE() and LOWCASE() – Convert text to uppercase or lowercase.
DATA new;
SET sample;
Name_Upper = UPCASE(Name);
Name_Lower = LOWCASE(Name);
RUN;
- SUBSTR() – Extracts a portion of a string.
DATA new;
SET sample;
First_Letter = SUBSTR(Name, 1, 1);
RUN;
2 Numeric Functions
- SUM() – Adds values together.
DATA new;
SET sample;
Total_Age = SUM(Age, 5);
RUN;
- ROUND() – Rounds a number to the nearest integer or specified decimal.
DATA new;
SET sample;
Rounded_Age = ROUND(Age, 1);
RUN;
3 Date Functions
- TODAY() – Returns the current date.
DATA new;
Current_Date = TODAY();
RUN;
- INTCK() – Counts the number of interval boundaries (here, years) between two dates.
DATA new;
SET sample;
Age_Difference = INTCK('YEAR', '01JAN2000'D, '01JAN2023'D);
RUN;
5. Statistical Procedures in SAS
1 Descriptive Statistics
- Using PROC MEANS for summary statistics:
PROC MEANS DATA=sample;
VAR Age;
RUN;
- Frequency distribution with PROC FREQ:
PROC FREQ DATA=newdata; /* newdata, created earlier, contains AgeGroup */
TABLES AgeGroup;
RUN;
2 Hypothesis Testing
- Performing a T-Test:
PROC TTEST DATA=newdata; /* newdata, created earlier, contains AgeGroup */
CLASS AgeGroup;
VAR Age;
RUN;
3 Regression Analysis
- Running a linear regression model:
PROC REG DATA=sample; /* assumes the dataset also contains Height and Weight */
MODEL Age = Height Weight;
RUN;
6. Data Visualization in SAS
SAS provides various tools for data visualization.
1. Creating Basic Graphs
- Bar chart using PROC SGPLOT:
PROC SGPLOT DATA=newdata; /* newdata, created earlier, contains AgeGroup */
VBAR AgeGroup;
RUN;
2. Scatter Plots
- Visualizing relationships between two variables:
PROC SGPLOT DATA=sample; /* assumes the dataset also contains Height */
SCATTER X=Age Y=Height;
RUN;
7. Advanced SAS Topics
1. PROC SQL: Using SQL in SAS
- Selecting data with PROC SQL:
PROC SQL;
SELECT Name, Age FROM sample WHERE Age > 25;
QUIT;
2. SAS Macros: Automating Tasks
- Defining a simple macro:
%MACRO printdata;
PROC PRINT DATA=sample;
RUN;
%MEND printdata;
%printdata;
3. Handling Large Datasets
- Using indexing for faster processing.
- Optimizing SAS programs with efficient coding practices.
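The indexing point above can be sketched as follows. This is a minimal example, assuming a hypothetical dataset work.sales with a Region column; neither is defined at this point in the tutorial. Whether SAS actually uses the index depends on the data and on the query.

```sas
/* Ask SAS to note index usage in the log */
options msglevel=i;

/* Hypothetical dataset: work.sales with a Region column */
proc datasets library=work nolist;
    modify sales;
    index create Region;      /* simple index on one variable */
quit;

/* A WHERE clause on the indexed variable lets SAS consider using the index */
data north_sales;
    set sales;
    where Region = 'North';
run;
```

With MSGLEVEL=I set, the log includes a note when an index is selected, which is a simple way to check whether the index is helping.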
Advanced SAS Tutorials: SAS Macros
SAS Macros are powerful for automation, code reusability, and dynamic programming. They allow you to parametrize code and execute repetitive tasks efficiently.
1. Macro Variables
Macro variables store text or values and can be global or local.
Creating Macro Variables
%let name = Anu;
%put Hello, &name!;
🔹 %let assigns a value to a macro variable.
🔹 &name retrieves the value.
🔹 %put prints the result in the log.
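The difference between global and local macro variables can be illustrated with a small sketch. The names scope_demo and show_scope are hypothetical and chosen only for this example.

```sas
/* Created in open code, so this macro variable is global */
%let scope_demo = global value;

%macro show_scope;
    /* %local creates a separate variable that exists only inside the macro */
    %local scope_demo;
    %let scope_demo = local value;
    %put Inside macro: &scope_demo;
%mend show_scope;

%show_scope;

/* The global variable is unchanged by the assignment inside the macro */
%put Outside macro: &scope_demo;
```

The log should show "local value" for the first %put and "global value" for the second, because the %local statement shields the global variable from the assignment inside the macro.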
Using CALL SYMPUTX in a Data Step
data _null_;
call symputx('myVar', 'SAS Macro Tutorial');
run;
%put &=myVar;
🔹 CALL SYMPUTX creates a macro variable from within a data step.
2. Macro Functions
These functions help manipulate macro variables dynamically.
Example: %SYSFUNC for Date Handling
%let today = %sysfunc(today(), date9.);
%put Current date: &today;
🔹 %SYSFUNC allows the use of SAS functions within macros.
3. Writing Macro Programs
A macro program can generate dynamic SAS code.
Basic Macro with Parameters
%macro report(year);
proc print data=sales;
where year = &year;
run;
%mend report;
%report(2024);
🔹 %macro defines the macro.
🔹 &year is a parameterized variable.
🔹 %mend ends the macro.
4. Conditional Processing in Macros
Macros can use conditional logic (%IF-%THEN, %DO-%END).
Example: Conditional Execution
%macro checkSales(sales);
%if &sales > 1000 %then %do;
%put High Sales!;
%end;
%else %do;
%put Low Sales!;
%end;
%mend checkSales;
%checkSales(1200);
🔹 %IF-%THEN-%DO enables conditional execution within macros.
5. Looping in Macros (%DO Loops)
Loops allow repetitive execution.
Example: Macro Loop
%macro loopExample;
%do i = 1 %to 5;
%put Iteration: &i;
%end;
%mend loopExample;
%loopExample;
🔹 %DO-%TO-%END runs a loop inside the macro.
6. Using PROC SQL with Macros
Macros can dynamically create queries.
Example: Dynamic SQL Query
%macro getSales(region);
proc sql;
select * from sales where region = "&region";
quit;
%mend getSales;
%getSales(North);
Key Benefits of SAS Macros
✅ Reduces code duplication
✅ Increases efficiency for repetitive tasks
✅ Enables dynamic programming
✅ Enhances code flexibility
Analytics & Statistics in SAS
SAS is widely used for data analytics, predictive modeling, and statistical analysis. Below is a structured tutorial covering key SAS statistical techniques with practical examples.
1. Descriptive Statistics
Problem: Summarize key data insights.
Solution: Use PROC MEANS, PROC UNIVARIATE, and PROC FREQ.
Example: Summary Statistics
proc means data=sales mean median std min max;
var revenue;
run;
🔹 Computes mean, median, standard deviation, min, and max for revenue.
Example: Distribution Analysis
proc univariate data=sales;
var revenue;
histogram revenue / normal;
run;
🔹 Displays a histogram with a normal curve to assess data distribution.
Example: Frequency Counts for Categorical Variables
proc freq data=customers;
tables gender region;
run;
🔹 Counts occurrences of categorical values.
2. Hypothesis Testing (T-Test, ANOVA, Chi-Square)
Problem: Compare groups to determine statistical significance.
Solution: Use PROC TTEST, PROC ANOVA, and PROC FREQ.
Example: Independent T-Test (Comparing Two Groups)
proc ttest data=sales;
class region;
var revenue;
run;
🔹 Compares mean revenue between two regions.
Example: ANOVA (Comparing More than Two Groups)
proc anova data=sales;
class region;
model revenue = region;
run;
quit;
🔹 Checks if average revenue differs significantly across multiple regions.
Example: Chi-Square Test for Association
proc freq data=customers;
tables gender*purchase / chisq;
run;
🔹 Tests whether gender and purchase behavior are associated.
3. Correlation & Regression Analysis
Problem: Measure relationships between variables.
Solution: Use PROC CORR and PROC REG.
Example: Correlation Analysis
proc corr data=sales;
var revenue marketing_spend;
run;
🔹 Measures the strength of the linear relationship between revenue and marketing spend.
Example: Linear Regression (Predict Sales Based on Advertising Spend)
proc reg data=sales;
model revenue = advertising_spend;
run;
quit;
🔹 Predicts revenue based on advertising spend.
4. Logistic Regression (Binary Outcome Prediction)
Problem: Predict customer churn (Yes/No).
Solution: Use PROC LOGISTIC.
proc logistic data=customers;
model churn(event='1') = age income purchase_frequency;
run;
🔹 Predicts customer churn based on demographics and purchase behavior.
5. Time Series Analysis (Forecasting Trends)
Problem: Forecast future sales.
Solution: Use PROC TIMESERIES and PROC ARIMA.
Example: Simple Time Series Analysis
proc timeseries data=sales out=forecast;
id date interval=month;
var revenue;
run;
🔹 Analyzes monthly revenue trends.
Example: ARIMA Forecasting Model
proc arima data=sales;
identify var=revenue;
estimate p=1 q=1;
forecast lead=12 out=predictions;
run;
quit; /* PROC ARIMA is interactive and stays active until QUIT */
🔹 Predicts future sales for the next 12 months.
6. Cluster Analysis (Customer Segmentation)
Problem: Group customers based on behavior.
Solution: Use PROC FASTCLUS or PROC CLUSTER.
proc fastclus data=customers maxclusters=3 out=clusters;
var income purchase_frequency;
run;
🔹 Groups customers into 3 clusters based on income and purchases.
7. Principal Component Analysis (PCA) for Dimensionality Reduction
Problem: Reduce multicollinearity in high-dimensional datasets.
Solution: Use PROC PRINCOMP.
proc princomp data=customers out=pca_scores;
var age income purchase_frequency;
run;
🔹 Extracts principal components for feature reduction.
8. Decision Trees (Classification & Prediction)
Problem: Build a model to classify loan defaulters.
Solution: Use PROC HPSPLIT.
proc hpsplit data=loans;
class loan_status employment_status;
model loan_status = income credit_score;
run;
🔹 Builds a decision tree model for loan risk classification.
Key Takeaways
1. SAS provides robust statistical tools for analytics.
2. Regression, clustering, and decision trees help with predictions.
3. Time series forecasting is crucial for business trend analysis.
4. Automate reports with macros & ODS.
SAS Programming Basics
Concept | Description | Example |
---|---|---|
DATA Step | Used to create and modify datasets. | DATA mydata; SET olddata; RUN; |
PROC Step | Used for analysis and reporting. | PROC MEANS DATA=mydata; RUN; |
LIBNAME Statement | Assigns a library reference to a directory. | LIBNAME mylib 'C:\SASData'; |
IMPORT Data | Reads external files (CSV, Excel, etc.). | PROC IMPORT DATAFILE="data.csv" OUT=mydata DBMS=CSV REPLACE; RUN; |
EXPORT Data | Writes SAS datasets to external files. | PROC EXPORT DATA=mydata OUTFILE="output.csv" DBMS=CSV REPLACE; RUN; |
IF-THEN Logic | Applies conditional operations. | IF age > 18 THEN status = "Adult"; |
Loops (DO, DO WHILE) | Used for iteration and repetitive tasks. | DO i = 1 TO 10; x = i*2; END; |
MERGE Statement | Combines datasets based on a common variable. | DATA merged; MERGE data1 data2; BY ID; RUN; |
PROC SQL | Performs SQL queries within SAS. | PROC SQL; SELECT * FROM mydata; QUIT; |
Macro Variables | Automates repetitive tasks with dynamic values. | %LET name = John; %PUT &name; |
Trending Technologies in SAS (2025)
1. SAS Viya & Cloud-Based Analytics
- Cloud-native platform (AWS, Azure, Google Cloud).
- Faster AI & machine learning integration.
- Scalability & automation for large datasets.
🔹 Example: Running SAS on Cloud
cas mySession sessopts=(caslib=casuser timeout=1800);
- CAS (Cloud Analytic Services) enables parallel processing.
2. SAS + AI & Machine Learning
🔹 Why It’s Trending?
- Built-in AutoML for automated model selection.
- Deep learning support with Python & R integration.
- Enhanced text analytics & NLP.
🔹 Example: Machine Learning Model in SAS
proc hpsvm data=customer_data;
input Age Income Purchase_Frequency / level=interval; /* predictor variables */
target Churn; /* binary outcome to classify */
run;
- PROC HPSVM runs a Support Vector Machine (SVM) model.
3. SAS & Python Integration (SWAT, PROC PYTHON)
🔹 Why It’s Trending?
- Seamless integration with Python & Pandas.
- Run Python scripts inside SAS using PROC PYTHON.
🔹 Example: Running Python in SAS
proc python;
submit;
import pandas as pd
print("Hello from Python in SAS!")
endsubmit;
quit;
- Executes Python directly within SAS.
4. Real-Time & Streaming Analytics (IoT + SAS Edge AI)
🔹 Why It’s Trending?
- Edge analytics for IoT devices.
- Real-time fraud detection & predictive maintenance.
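🔹 Example: Streaming Data in SAS (conceptual sketch)
/* Conceptual pseudocode only, not verified SAS syntax. */
/* SAS Event Stream Processing projects are normally defined in its own tooling rather than in a PROC step. */
proc eventstream;
define data my_stream;
source "IoT_Devices";
run;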
- Analyzes live sensor data.
5. Low-Code & No-Code AI with SAS Visual Analytics
🔹 Why It’s Trending?
- Drag-and-drop AI modeling.
- No coding required for interactive dashboards.
6. SAS for Healthcare & Genomics
🔹 Why It’s Trending?
- AI-driven disease prediction.
- SAS Viya for medical research (COVID-19, cancer studies).
7. SAS in Risk & Fraud Analytics
🔹 Why It’s Trending?
- Banking & insurance fraud detection.
- AI-powered credit risk modeling.
Key Takeaways
✅ SAS Viya & Cloud Analytics dominate 2025.
✅ AI, ML, & Python integration are reshaping SAS.
✅ Real-time data processing & IoT analytics are booming.
Working with Datasets in SAS
In SAS, datasets are the core of data analysis. Here’s a complete guide on how to create, modify, merge, and analyze datasets using SAS.
1. Creating a Dataset
Example: Creating a Simple Dataset
data employees;
input ID Name $ Age Salary;
datalines;
1 John 28 50000
2 Sarah 32 60000
3 Mike 40 70000
;
run;
proc print data=employees;
run;
🔹 DATA employees; – Creates a dataset.
🔹 $ – Indicates a character variable (e.g., Name).
🔹 DATALINES; – Allows manual data entry.
🔹 PROC PRINT – Displays the dataset.
2. Modifying a Dataset (Adding/Removing Columns & Rows)
Example: Adding a New Column (Calculated Field)
data employees_bonus;
set employees;
Bonus = Salary * 0.10; /* 10% bonus calculation */
run;
proc print data=employees_bonus;
run;
🔹 SET employees; – Reads from the existing dataset.
🔹 New variable (Bonus) is created based on Salary.
Example: Dropping a Column
data employees_new;
set employees;
drop Age;
run;
🔹 DROP Age; removes the Age column from the new dataset.
3. Filtering Data (WHERE & IF Statements)
Example: Using WHERE in PROC Steps
proc print data=employees;
where Salary > 55000;
run;
🔹 Displays employees earning more than $55,000.
Example: Using IF in DATA Steps
data high_salary;
set employees;
if Salary > 55000;
run;
🔹 Creates a new dataset with only high-salary employees.
4. Sorting Data (PROC SORT)
Example: Sorting by Salary (Ascending Order)
proc sort data=employees out=employees_sorted;
by Salary;
run;
🔹 BY Salary; sorts in ascending order (default).
Example: Sorting in Descending Order
proc sort data=employees out=employees_sorted;
by descending Salary;
run;
🔹 The DESCENDING keyword sorts from highest to lowest.
5. Merging & Combining Datasets
Example: Merging Two Datasets Using PROC SQL
proc sql;
create table employee_sales as
select a.ID, a.Name, a.Salary, b.Sales
from employees a
left join sales_data b
on a.ID = b.ID;
quit;
🔹 Left join merges employee details with sales data.
Example: Appending Datasets (SET)
data all_employees;
set employees2023 employees2024;
run;
🔹 Combines datasets with the same structure.
6. Detecting & Handling Missing Values
Example: Finding Missing Values
proc means data=employees n nmiss;
run;
🔹 N and NMISS report the number of non-missing and missing values for each numeric variable.
Example: Removing Missing Values
data cleaned_data;
set employees;
if missing(Salary) then delete;
run;
🔹 Deletes rows where Salary is missing.
7. Transposing Data (Long to Wide Format)
Example: Using PROC TRANSPOSE
proc transpose data=sales out=sales_wide;
by Product_ID;
id Month;
var Revenue;
run;
🔹 Converts monthly sales data from long to wide format.
Key Takeaways
✅ Create & modify datasets using DATA and PROC steps.
✅ Filter data using WHERE and IF.
✅ Sort, merge, and append datasets efficiently.
✅ Detect and handle missing values effectively.
Conclusion
SAS is a powerful tool for data analysis, statistical modeling, and business intelligence. Whether you’re a beginner or an experienced professional, mastering SAS can open doors to various career opportunities in data-driven industries like healthcare, finance, and retail.
To become proficient in SAS:
✅ Learn the fundamentals of Base SAS, including the DATA step and PROC step
✅ Practice using SAS Macros and SQL for efficient data processing
✅ Work with real-world datasets to apply concepts
✅ Utilize SAS OnDemand for Academics for hands-on experience
✅ Stay updated with SAS documentation and online communities
By consistently practicing and exploring advanced SAS techniques, you can enhance your analytical skills and become a valuable asset in data analytics and business intelligence. Keep learning, experimenting, and applying your knowledge to solve real-world problems!
FAQ
What is SAS?
SAS (Statistical Analysis System) is a software suite used for data management, statistical analysis, business intelligence, and predictive analytics.
Who should learn SAS?
SAS is ideal for data analysts, business analysts, statisticians, data scientists, and professionals in industries like healthcare, finance, and retail who work with large datasets.
Do I need prior programming experience to learn SAS?
No, SAS is beginner-friendly. However, familiarity with basic programming concepts or SQL can be helpful.
What are the key components of SAS?
- SAS Base (Data step & PROC step)
- SAS Macro (Automating tasks)
- SAS SQL (Working with databases)
- SAS Statistical Procedures (PROC MEANS, PROC REG, etc.)
How can I get access to SAS?
You can download SAS from the official SAS website or use SAS OnDemand for Academics, which is free for learning purposes.
Which SAS procedures are most commonly used?
- PROC PRINT – Displays data
- PROC MEANS – Computes summary statistics
- PROC FREQ – Generates frequency tables
- PROC SQL – Works with databases using SQL
- PROC IMPORT/EXPORT – Reads and writes external files
How does SAS compare with Python and R?
SAS is enterprise-focused with strong support for data management and reporting. Python & R are open-source, flexible, and widely used in machine learning.
Where can I practice SAS programming?
You can use SAS OnDemand for Academics or explore online SAS coding platforms.
What are SAS Macros?
SAS Macros automate repetitive tasks and make programs more dynamic by reducing manual coding.
How do I import data into SAS?
You can use:
- PROC IMPORT for Excel/CSV files
- LIBNAME Statement for databases
- INFILE Statement for text files
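The INFILE and LIBNAME options listed above might look like the sketch below. The file path, column layout, libref, and data source name are all hypothetical, and the database engine you can use depends on which SAS/ACCESS products are licensed at your site.

```sas
/* Hypothetical text file: comma-delimited, with a header in the first row */
data employees_txt;
    infile 'C:\SASData\employees.txt' dlm=',' dsd firstobs=2 truncover;
    input ID Name :$20. Age Salary;   /* :$20. reads Name as text up to 20 characters */
run;

/* Hypothetical database connection through ODBC; my_dsn is a placeholder */
/* libname mydb odbc dsn=my_dsn; */

proc print data=employees_txt;
run;
```

DSD treats consecutive delimiters as missing values, and TRUNCOVER prevents SAS from jumping to the next line when a record is shorter than expected.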
How do I debug a SAS program?
- Use the LOG window to check messages
- Use OPTIONS MPRINT MLOGIC SYMBOLGEN for macro debugging
- Run the program step by step to isolate issues
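The macro debugging options mentioned above can be tried with a small sketch like this one. The macro checkAge is hypothetical; it reuses the sample dataset from the first program in this tutorial.

```sas
/* Turn on macro tracing: generated code, macro logic, and variable resolution */
options mprint mlogic symbolgen;

%macro checkAge(minAge);          /* hypothetical macro for illustration */
    proc print data=sample;
        where Age >= &minAge;
    run;
%mend checkAge;

%checkAge(28);

/* Turn tracing off again to keep later logs readable */
options nomprint nomlogic nosymbolgen;
```

MPRINT shows the SAS code the macro generates, MLOGIC traces macro execution, and SYMBOLGEN shows what each macro variable resolves to.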
What career opportunities are available for SAS professionals?
SAS skills are in demand in banking, healthcare, insurance, and government sectors. Roles include:
- SAS Analyst
- Data Analyst
- Business Intelligence Analyst
- Clinical SAS Programmer
Which SAS certifications are available?
SAS offers various certifications, such as:
- SAS Base Programmer
- SAS Advanced Programmer
- SAS Certified Data Scientist
You can take courses from SAS or other platforms like Coursera and Udemy.
What are the best resources for learning SAS?
- SAS Documentation (support.sas.com)
- Online courses (Coursera, Udemy, LinkedIn Learning)
- YouTube tutorials
- SAS Communities & Forums
How can I improve my SAS skills?
- Practice regularly with datasets
- Work on real-world projects
- Join SAS user communities
- Follow structured learning paths