Data Warehousing for the SUNY System AIRPO, Winter 2006 Maggie Moehringer
[email protected]
SUNY “Data Warehouse” !
!
! !
!
A collection of data repositories (files, tables), with data geared to different functional and data usage needs Interrelated (or interrelate-able) at some level of data summarization and time slicing Read only May contain transactional detail but do not directly support transaction processing Optimized for self service for analysts and knowledge workers who need to create or execute queries/inquiries
SUNY Information Environment Components ! ! ! ! !
The information audience The data itself The data repositories Access: Tools to get to the data The “Plan”
Our Information Audiences !
Indirect users: ! ! ! ! ! !
!
Prospective students and parents The interested public Media NYS Senate and Assembly NYS executive/agencies (Governor, DoB, etc.) SUNY Board
Direct Users: SUNY System and campus functional office and analytical/planning staff
Direct “Hands On” Audience for the Data Warehouse !
System and campus functional offices !
Administrative/Operational Data Usage: !
!
Detailed, low level granular, current and historical, transactional; within a function.
System and campus analytical staff !
Analytical / Planning Data Usage: !
Longitudinal, comparative, cohort, statistical and projective purposes; cross functional; detailed, not transactional; stable time slice
Data: What We Have… and Don’t Have ! !
“We” = SUNY analytical staff Employees: ! ! !
!
Applicants/Applications: ! !
!
at State operated campuses…. but not at community colleges and not everyone who provides instruction. for ASC participants, but not all applications… and not non-participating campuses.
Student/applicant socio-economic and financial aid: !
None.
Data: What We Have… and Don’t Have (cont’d) !
Funding: ! !
!
Enrollment: ! ! ! !
!
that flows through state accounts… but not funding that flows through RF, CF or local campus accounts. as of the census date… but not changes in student enrollment after that, and not some populations that are funded and not unfunded activity.
Instructional activity/cost/workload: ! !
for the State Operated campuses… but not for Community Colleges.
Data: Major Frustrations !
!
! !
Production information systems often do not include the complete information necessary to support management inquiries and decision making. Knowledge workers are forced to bring together data from different sources, summary levels, and time slices, and must be very knowledgeable about data shortcomings. “Yes, we kind of have that info, but…” Hard to allow unfettered access, but we must figure out a way.
Data Repositories: Current Technology ! !
!
! !
Old legacy production systems New Oracle relational versions of old legacy data New Oracle star schema versions of old legacy data New Oracle systems Spreadsheets, summary data feeds, special compilations, etc.
Data Repositories: Future Technology !
!
Oracle instances supporting transactional systems and functional operations Oracle instances supporting reporting: ! !
Relational data bases Dimensional data bases
Future Repository Design: Getting our Staff There ! !
! !
!
!
!
1999: Short information gathering project. Technical staff: training for two people on data warehousing, dimensional modeling. Training for two staff on Oracle Warehouse Builder. 2001: One star in an area with good data (SDF Enrollment). Then two more stars (ASC Applicants, State Employees). Training for users and technical staff on a query tool (Oracle Discoverer). Refinement of extract, transformation and load (ETL) procedures.
Data Repository Design: DW Expertise on Campuses ! ! !
“Banner Reporting Initiative” Survey Expertise deficit on campuses. Ways to improve it: ! ! ! ! !
Some training Oracle tools Using what we have Collaborative assistance Possible product acquisition.
Tools to Access Data !
Major consideration: the security environment at System Administration " " " " "
"
"
UserID/Password Secured Web access is “portal” driven Web clients for most users Single sign-on Distributed maintenance of identification/authorization information Therefore, access to SUNY systems by client tools (Access, Cognos, etc.) with internal security that must be centrally maintained is an administrative issue Access to SUNY systems by clients tools is a support issue.
Tools: the Possibilities !
! ! ! !
!
At the simplest level, web pages can display pre-formatted data (HTML, PDFs, etc.); not enough. Custom Inquiries Canned Queries Distributed Datasets for static data Query or analytical tool in local use with downloaded data Direct query access.
Tools: Probabilities for Campus Access !
Custom inquiries, developed in Cold Fusion or Java (e.g. current SMRT for Finance) ! !
!
Canned queries (Discoverer) !
!
!
Developing “SMRT for Enrollment” Can be smarter than a dumb query Optimized, parameter driven for flexibility
Distributed Reports and Datasets for static data If it’s necessary, query access.
The Plan: The SMRT Environment ! !
! !
“SUNY Management Reporting Tool” S-M-R-T was intended for use as a general acronym. “The SMRT Portal” “SMRT for Enrollment”, “SMRT for Human Resources”, “SMRT for Academic Programs”…..
What belongs in the SMRT inquiry environment? !
! ! ! ! ! !
Designed to address this problem: “I can’t allow them to have access to my data because they don’t understand the data, they don’t know how to ask the question, they might make a mistake.” Guided, mistake-proof, supported by metadata, always inquiry only, and An inquiry that’s useful for users who are not working in the specific business area, OR An inquiry that’s useful for a broader audience than the specific business area user, and often A higher level inquiry than the most granular level of detail, and often Geared to users who are likely to want to see reporting out of multiple business areas OR Users who will not be using the transactional and update capabilities of the business application.
Measures of Success for SMRTs ! ! ! ! !
QUICKLY developed East to change, enhance Cover most of the need Easy to use Impossible to misinterpret the data.
SMRT Development Process ! ! ! ! !
!
Input at the System level Development of basic views Review with campus interest groups Enhancement and deployment Provision of “gap filling” queries and reports. Ongoing assessment and improvement with campus and system user groups.
DW/Reports User Interface Environment Facts Data Mart (non-Web)
SUNY.edu
Fast Facts
Employee Portal
Publicly Accessible Inquiries
Business Area Apps
SMRT Inquiries
Business area specific inquiries and output
Common Facilities
Business Applications DW facilities
Discoverer Viewer: Canned BA Query Output
Metadata UI Color Key:
SMRT Portal
Canned DW Query Portal and Query Output
Data Policies & Procedures Legacy Reports (temporary)
User SMRT/DW Documentation Interim Metadata
Candidate SMRT Inquiries ! !
Don’t wait for the perfect systems and DW; use what we have Pockets of readiness: HR, Enrollment, Academic Programs ! !
!
BUT serious data vacuums AND questions about SUNY wide access
SMRT for Enrollment: Helen Ernst – Technical Lead !
Of use to: ! ! ! !
! ! ! !
Budget analysts Enrollment managers Institutional researchers Executive management
SUNY wide data Requirements for the System office views defined Requirement for campus views needed Common features to all SMRTs: printer friendly version, Excel downloads, etc.
Other “SMRTs” !
SMRT for Academic Programs ! !
! ! ! ! ! ! !
Enrollment, degrees granted AND, through relationships, costs, staffing, ...
SMRT SMRT SMRT SMRT SMRT SMRT etc.
for Student Outcomes for Human Resources for Faculty for Campus Profiles for Applicant Profiles for SUNY Allocations and Expenditures
Where We Are Now !
Improving the repositories !
! !
! ! ! !
Enrollment: filling gaps, adding detail, adding metrics Degrees Granted Academic programs
“SMRT for Enrollment”: 23 views Preparing for campus demos and comments to perfect the tool. Input groups: AIRPO, ABB AIRPO sub committee?