Data Warehousing for the SUNY System
AIRPO, Winter 2006
Maggie Moehringer
[email protected]

SUNY “Data Warehouse”

- A collection of data repositories (files, tables), with data geared to different functional and data usage needs
- Interrelated (or interrelate-able) at some level of data summarization and time slicing
- Read only
- May contain transactional detail but do not directly support transaction processing
- Optimized for self service for analysts and knowledge workers who need to create or execute queries/inquiries

SUNY Information Environment Components

- The information audience
- The data itself
- The data repositories
- Access: Tools to get to the data
- The “Plan”

Our Information Audiences

- Indirect users:
  - Prospective students and parents
  - The interested public
  - Media
  - NYS Senate and Assembly
  - NYS executive/agencies (Governor, DoB, etc.)
  - SUNY Board
- Direct users: SUNY System and campus functional office and analytical/planning staff

Direct “Hands On” Audience for the Data Warehouse

- System and campus functional offices
  - Administrative/Operational Data Usage:
    - Detailed, low level granular, current and historical, transactional; within a function.
- System and campus analytical staff
  - Analytical/Planning Data Usage:
    - Longitudinal, comparative, cohort, statistical and projective purposes; cross functional; detailed, not transactional; stable time slice

Data: What We Have… and Don’t Have

- “We” = SUNY analytical staff
- Employees:
  - at State operated campuses… but not at community colleges and not everyone who provides instruction.
- Applicants/Applications:
  - for ASC participants, but not all applications… and not non-participating campuses.
- Student/applicant socio-economic and financial aid:
  - None.

Data: What We Have… and Don’t Have (cont’d)

- Funding:
  - that flows through state accounts… but not funding that flows through RF, CF or local campus accounts.
- Enrollment:
  - as of the census date… but not changes in student enrollment after that, and not some populations that are funded and not unfunded activity.
- Instructional activity/cost/workload:
  - for the State Operated campuses… but not for Community Colleges.

Data: Major Frustrations

- Production information systems often do not include the complete information necessary to support management inquiries and decision making.
- Knowledge workers are forced to bring together data from different sources, summary levels, and time slices, and must be very knowledgeable about data shortcomings.
- “Yes, we kind of have that info, but…”
- Hard to allow unfettered access, but we must figure out a way.

Data Repositories: Current Technology

- Old legacy production systems
- New Oracle relational versions of old legacy data
- New Oracle star schema versions of old legacy data
- New Oracle systems
- Spreadsheets, summary data feeds, special compilations, etc.

Data Repositories: Future Technology

- Oracle instances supporting transactional systems and functional operations
- Oracle instances supporting reporting:
  - Relational data bases
  - Dimensional data bases
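
To make the "dimensional" option concrete, here is a minimal sketch of a star schema: one fact table joined to two dimension tables. It uses Python's built-in sqlite3 rather than Oracle so that it runs anywhere. Every table name, column name, and figure below is a made-up illustration, not SUNY's actual design or data.

```python
import sqlite3

# Hypothetical star schema: one fact table keyed to two dimension tables.
# All names and numbers are illustrative assumptions, not real SUNY data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_campus (
        campus_key  INTEGER PRIMARY KEY,
        campus_name TEXT,
        sector      TEXT
    );
    CREATE TABLE dim_term (
        term_key  INTEGER PRIMARY KEY,
        term_name TEXT,
        year      INTEGER
    );
    CREATE TABLE fact_enrollment (
        campus_key INTEGER REFERENCES dim_campus(campus_key),
        term_key   INTEGER REFERENCES dim_term(term_key),
        headcount  INTEGER
    );
    INSERT INTO dim_campus VALUES
        (1, 'Campus A', 'State operated'),
        (2, 'Campus B', 'Community college');
    INSERT INTO dim_term VALUES (1, 'Fall', 2004), (2, 'Fall', 2005);
    INSERT INTO fact_enrollment VALUES
        (1, 1, 100), (1, 2, 110), (2, 1, 50), (2, 2, 55);
""")


def headcount_by_sector_and_year(connection):
    """Summarize the fact table by attributes taken from both dimensions."""
    return connection.execute("""
        SELECT c.sector, t.year, SUM(f.headcount)
        FROM fact_enrollment f
        JOIN dim_campus c ON c.campus_key = f.campus_key
        JOIN dim_term   t ON t.term_key   = f.term_key
        GROUP BY c.sector, t.year
        ORDER BY c.sector, t.year
    """).fetchall()


summary = headcount_by_sector_and_year(conn)
```

The measures live in the fact table and the descriptive attributes live in the dimensions, so an analyst can slice the same facts by sector, year, or both without touching the transactional systems.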

Future Repository Design: Getting our Staff There

- 1999: Short information gathering project.
- Technical staff: training for two people on data warehousing, dimensional modeling.
- Training for two staff on Oracle Warehouse Builder.
- 2001: One star in an area with good data (SDF Enrollment).
- Then two more stars (ASC Applicants, State Employees).
- Training for users and technical staff on a query tool (Oracle Discoverer).
- Refinement of extract, transformation and load (ETL) procedures.
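
As a rough illustration of what an extract, transformation and load (ETL) procedure does, the sketch below parses a small feed, normalizes it, sets aside incomplete rows, and loads the rest into a reporting table. It is plain Python with sqlite3, not Oracle Warehouse Builder. The feed layout, the function names, and the sample values are all assumptions made up for this example.

```python
import csv
import io
import sqlite3

# Made-up legacy extract; a real feed would come from a production system.
LEGACY_EXTRACT = """campus,term,headcount
 campus a ,FALL 2005,110
Campus B,fall 2005,55
Campus C,Fall 2005,
"""


def extract(text):
    """Extract: parse the raw feed into one dictionary per row."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize names and terms; set aside incomplete rows."""
    clean, rejected = [], []
    for row in rows:
        if not row["headcount"].strip():
            rejected.append(row)  # missing measure: keep for review, do not load
            continue
        clean.append((
            row["campus"].strip().title(),
            row["term"].strip().title(),
            int(row["headcount"]),
        ))
    return clean, rejected


def load(connection, rows):
    """Load: write the cleaned rows into a reporting table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS enrollment "
        "(campus TEXT, term TEXT, headcount INTEGER)"
    )
    connection.executemany("INSERT INTO enrollment VALUES (?, ?, ?)", rows)
    connection.commit()


conn = sqlite3.connect(":memory:")
clean_rows, rejected_rows = transform(extract(LEGACY_EXTRACT))
load(conn, clean_rows)
```

Keeping the rejected rows, rather than silently dropping them, gives the people refining the procedure a record of what the source systems failed to supply.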

Data Repository Design: DW Expertise on Campuses

- “Banner Reporting Initiative” Survey
- Expertise deficit on campuses.
- Ways to improve it:
  - Some training
  - Oracle tools
  - Using what we have
  - Collaborative assistance
  - Possible product acquisition.

Tools to Access Data

- Major consideration: the security environment at System Administration
  - UserID/Password Secured
  - Web access is “portal” driven
  - Web clients for most users
  - Single sign-on
  - Distributed maintenance of identification/authorization information
  - Therefore, access to SUNY systems by client tools (Access, Cognos, etc.) with internal security that must be centrally maintained is an administrative issue
  - Access to SUNY systems by client tools is a support issue.

Tools: the Possibilities

- At the simplest level, web pages can display pre-formatted data (HTML, PDFs, etc.); not enough.
- Custom Inquiries
- Canned Queries
- Distributed Datasets for static data
- Query or analytical tool in local use with downloaded data
- Direct query access.

Tools: Probabilities for Campus Access

- Custom inquiries, developed in Cold Fusion or Java (e.g. current SMRT for Finance)
  - Developing “SMRT for Enrollment”
  - Can be smarter than a dumb query
  - Optimized, parameter driven for flexibility
- Canned queries (Discoverer)
- Distributed Reports and Datasets for static data
- If it’s necessary, query access.
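
One way to picture a "parameter driven" custom inquiry is a function that accepts a few guided choices and builds the query itself, so the user never writes SQL. The sketch below is a Python/sqlite3 illustration under that assumption; it is not the Cold Fusion or Java code behind any SMRT. The table, columns, function name, and figures are hypothetical.

```python
import sqlite3

# Only these grouping columns may be requested (hypothetical names).
ALLOWED_GROUPINGS = {"campus", "term"}


def enrollment_inquiry(connection, group_by, min_headcount=0):
    """Run a guided inquiry: the user picks parameters, never writes SQL."""
    if group_by not in ALLOWED_GROUPINGS:
        raise ValueError(f"group_by must be one of {sorted(ALLOWED_GROUPINGS)}")
    # The column name comes from the whitelist above;
    # the threshold is passed as a bound parameter.
    sql = (
        f"SELECT {group_by}, SUM(headcount) FROM enrollment "
        f"GROUP BY {group_by} HAVING SUM(headcount) >= ? "
        f"ORDER BY {group_by}"
    )
    return connection.execute(sql, (min_headcount,)).fetchall()


# Made-up sample data for the illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enrollment (campus TEXT, term TEXT, headcount INTEGER)")
conn.executemany("INSERT INTO enrollment VALUES (?, ?, ?)", [
    ("Campus A", "Fall 2004", 100), ("Campus A", "Fall 2005", 110),
    ("Campus B", "Fall 2004", 50), ("Campus B", "Fall 2005", 55),
])

by_campus = enrollment_inquiry(conn, "campus")
by_term = enrollment_inquiry(conn, "term", min_headcount=160)
```

Restricting the choices to a known set is what lets an inquiry of this kind stay flexible while remaining hard to misuse.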

The Plan: The SMRT Environment

- “SUNY Management Reporting Tool”
- S-M-R-T was intended for use as a general acronym.
- “The SMRT Portal”
- “SMRT for Enrollment”, “SMRT for Human Resources”, “SMRT for Academic Programs”…

What belongs in the SMRT inquiry environment?

- Designed to address this problem: “I can’t allow them to have access to my data because they don’t understand the data, they don’t know how to ask the question, they might make a mistake.”
- Guided, mistake-proof, supported by metadata, always inquiry only, and
- An inquiry that’s useful for users who are not working in the specific business area, OR
- An inquiry that’s useful for a broader audience than the specific business area user, and often
- A higher level inquiry than the most granular level of detail, and often
- Geared to users who are likely to want to see reporting out of multiple business areas OR
- Users who will not be using the transactional and update capabilities of the business application.

Measures of Success for SMRTs

- QUICKLY developed
- Easy to change, enhance
- Cover most of the need
- Easy to use
- Impossible to misinterpret the data.

SMRT Development Process

- Input at the System level
- Development of basic views
- Review with campus interest groups
- Enhancement and deployment
- Provision of “gap filling” queries and reports.
- Ongoing assessment and improvement with campus and system user groups.

DW/Reports User Interface Environment

[Diagram with a UI color key. Labeled elements: SUNY.edu; Fast Facts; Facts Data Mart (non-Web); Employee Portal; Publicly Accessible Inquiries; Business Area Apps; SMRT Inquiries; Business area specific inquiries and output; Common Facilities; Business Applications; DW facilities; Discoverer Viewer: Canned BA Query Output; Metadata; SMRT Portal; Canned DW Query Portal and Query Output; Data Policies & Procedures; Legacy Reports (temporary); User SMRT/DW Documentation; Interim Metadata.]

Candidate SMRT Inquiries

- Don’t wait for the perfect systems and DW; use what we have
- Pockets of readiness: HR, Enrollment, Academic Programs
  - BUT serious data vacuums
  - AND questions about SUNY wide access

SMRT for Enrollment: Helen Ernst – Technical Lead

- Of use to:
  - Budget analysts
  - Enrollment managers
  - Institutional researchers
  - Executive management
- SUNY wide data
- Requirements for the System office views defined
- Requirement for campus views needed
- Common features to all SMRTs: printer friendly version, Excel downloads, etc.

Other “SMRTs”

- SMRT for Academic Programs
  - Enrollment, degrees granted
  - AND, through relationships, costs, staffing, ...
- SMRT for Student Outcomes
- SMRT for Human Resources
- SMRT for Faculty
- SMRT for Campus Profiles
- SMRT for Applicant Profiles
- SMRT for SUNY Allocations and Expenditures
- etc.

Where We Are Now

- Improving the repositories
  - Enrollment: filling gaps, adding detail, adding metrics
  - Degrees Granted
  - Academic programs
- “SMRT for Enrollment”: 23 views
- Preparing for campus demos and comments to perfect the tool.
- Input groups: AIRPO, ABB
- AIRPO sub committee?