Managed Services Cloud Services Consul3ng Services Licensing
Using Exadata Flash Disks to Speed up Joins and Sorts Kasey Parker Enterprise Architect
[email protected] • Managed Services • Cloud Services • Consul3ng Services • Licensing
Who is Centroid? QUICK FACTS § Centroid is a leading provider of Oracle Technology, Applica8ons and Infrastructure/Hos8ng solu8ons § Oracle Pla8num Partner • Oracle Excellence Award: 2014 Engineered Systems Partner of the Year • Selected to Oracle’s Top 25 Strategic Partner Program • Top 5 Oracle Partner for Hardware/Storage
§ Established in 1997 § Office loca8ons: Troy, MI (HQ); San Francisco, CA; Los Angeles, CA; Dallas, TX § 200+ Consultants § “Clients for life” approach to customer rela8onships § Oracle Exadata Center of Excellence established in 2011 • Centroid Authored -‐ Oracle Exadata Recipes (Published Feb-‐2013) • Managed Services • Cloud Services • Consul3ng Services • Licensing
Agenda § Exadata Overview § Why Exadata? § Exadata Flash § Smart Flash Cache § Flash-‐based Grid disks
§ Using Flash for Temp § Risk/Reward § How to set it up
§ Q&A • Managed Services • Cloud Services • Consul3ng Services • Licensing
EXADATA OVERVIEW
• Managed Services • Cloud Services • Consul3ng Services • Licensing
Exadata Architecture Database hardware and soOware plaPorm “in a box” Scale-‐Out Database Servers • 8x 2-‐socket, or 2x 8-‐socket Xeon database servers • Oracle Database, ASM, RAC; Linux or Solaris • Standard Ethernet to data center Scale-‐Out Intelligent Storage Servers • 2-‐socket storage servers, Exadata Storage SoOware • Up to 672 terabytes disk per rack • 56 PCI Flash memory cards per rack InfiniBand Network • Unified internal connec3vity ( 40 Gb/sec )
• Managed Services • Cloud Services • Consul3ng Services • Licensing
Exadata Configura3on Op3ons Start small and grow as needed – upgraded onsite
Eighth Rack X4-‐2
Quarter Rack X4-‐2
• Managed Services • Cloud Services • Consul3ng Services • Licensing
Half Rack X4-‐2
Full Rack X4-‐2 X4-‐8
Exadata Hardware Summary X4-2 Full
X4-2 Half
X4-2 Quarter
X4-2 Eighth
8
4
2
2
192
96
48
24
2048 (max 4096)
1024 (max 2048)
512 (max 1024)
512 (max 1024)
InfiniBand switches
2
2
2
2
Ethernet switch
1
1
1
1
Exadata Storage Servers
14
7
3
3
Storage Grid CPU Cores
168
84
36
18
44.8 TB
22.4 TB
9.6 TB
4.8 TB
High Perf
200 TB
100 TB
43.2 TB
21.6 TB
High Cap
672 TB
336 TB
144 TB
72 TB
High Perf
90 TB
45 TB
19 TB
9 TB
High Cap
300 TB
150 TB
63 TB
30 TB
High Perf
60 TB
30 TB
13 TB
6.3 TB
High Cap
200 TB
100 TB
43 TB
21.5 TB
Database Servers Database Grid Cores Database Grid Memory (GB)
Raw Flash Capacity Raw Storage Capacity
Usable mirrored capacity
Usable Triple mirrored capacity
• Managed Services • Cloud Services • Consul3ng Services • Licensing
Exadata Hardware Summary X4-2 Full
X4-2 Half
X4-2 Quarter
X4-2 Eighth
8
4
2
2
192
96
48
24
2048 (max 4096)
1024 (max 2048)
512 (max 1024)
512 (max 1024)
InfiniBand switches
2
2
2
2
Ethernet switch
1
1
1
1
Exadata Storage Servers
14
7
3
3
Storage Grid CPU Cores
168
84
36
18
44.8 TB
22.4 TB
9.6 TB
4.8 TB
High Perf
200 TB
100 TB
43.2 TB
21.6 TB
High Cap
672 TB
336 TB
144 TB
72 TB
High Perf
90 TB
45 TB
19 TB
9 TB
High Cap
300 TB
150 TB
63 TB
30 TB
High Perf
60 TB
30 TB
13 TB
6.3 TB
High Cap
200 TB
100 TB
43 TB
21.5 TB
Database Servers Database Grid Cores Database Grid Memory (GB)
Raw Flash Capacity Raw Storage Capacity
Usable mirrored capacity
Usable Triple mirrored capacity
• Managed Services • Cloud Services • Consul3ng Services • Licensing
Exadata Hardware Exadata X4-‐2 SQL IO Performance Flash Cache SQL Bandwidth1,3 Flash SQL IOPS2,3 Disk SQL Bandwidth1,3 Disk SQL IOPS Data Load Rate4
X4-2 Full Rack
X4-2 Half Rack
X4-2 Quarter
X4-2 Eighth
High Cap Disk
100 GB/s
50 GB/s
21.5 GB/s
10.7 GB/s
High Perf Disk
100 GB/s
50 GB/s
21.5 GB/s
10.7 GB/s
8K Reads
2,660,000
1,330,000
570,000
285,000
8K Writes
1,960,000
980,000
420,000
210,000
High Cap Disk
20 GB/s
10 GB/s
4.5 G/s
2.25 GB/s
High Perf Disk
24 GB/s
12 GB/s
5.2 GB/s
2.6 GB/s
High Cap Disk
32,000
16,000
7,000
3,500
High Perf Disk
50,000
25,000
10,800
5,400
20 TB/hr
10 TB/hr
5 TB/hr
2.5 TB/hr
1 -‐ Bandwidth is peak physical scan bandwidth achieved running SQL, assuming no compression. Effec3ve data bandwidth will be much higher when compression is factored in. 2 -‐ IOPS – Based on read IO requests of size 8K running SQL, typically with sub-‐millisecond latencies. Note that the IO size greatly effects flash IOPS. Others quote IOPS based on 2K, 4K or smaller IOs that are not relevant for databases and measure IOs using low level tools instead of SQL. 3-‐ Actual Performance varies by applica3on. 4 –Load rates are typically limited by database server CPU, not IO. Rates vary based on load method, indexes, data types, compression, and par33oning
• Managed Services • Cloud Services • Consul3ng Services • Licensing
WHY EXADATA?
• Managed Services • Cloud Services • Consul3ng Services • Licensing
Why Exadata? Exadata is designed to eliminate the most common bomleneck for large databases…
Timely transfer of large data sets from storage subsystem to database server • Managed Services • Cloud Services • Consul3ng Services • Licensing
Why Exadata? Solving the IO BoUleneck Solu3on 1: Enlarge the pipe
• Physical disks, on all cells, work in parallel to serve IO requests • Large Infiniband pipe (40GB/Sec) • Managed Services • Cloud Services • Consul3ng Services • Licensing
Why Exadata? Can’t we do that with other high performance storage soluWons?
YES… There is nothing Magical about Exadata hardware, and it’s s3ll the same Oracle Database • Managed Services • Cloud Services • Consul3ng Services • Licensing
Why Exadata? Exadata’s Secret Sauce Hybrid Columnar Compression
Intelligent storage
– 10x compression for warehouses
– Scale-out InfiniBand storage – Smart Scan query offload
+
+
– 15x compression for archives
uncompressed
+
Data remains compressed for scans and in Flash
Smart PCI Flash Cache – Accelerates random I/O up to 30x – Triples data scan rate
Benefits Cascade to Copies
• Managed Services • Cloud Services • Consul3ng Services • Licensing
compress primary DB
standby
test
dev
backup
Why Exadata? Solving the IO BoUleneck Solu3on 2: Reduce the IO opera3ons
• Done using Exadata’s Secret Sauce: Storage Offloading, Smart Flash Cache and Hybrid Columnar Compression • 10X reduc3on in data sent to database servers is common • Managed Services • Cloud Services • Consul3ng Services • Licensing
EXADATA FLASH
• Managed Services • Cloud Services • Consul3ng Services • Licensing
Smart Flash Cache I/Os
• Caches Read and Write I/Os in PCI flash • Transparently accelerates read and write intensive workloads – Up to 2.66 million 8K read IOPS from SQL – Up to 1.96 million 8K write IOPS from SQL
2.66 Million 8K Read 1.96 Million 8K Write IOPS from SQL
• Persistent write cache speeds database recovery • Exadata Flash Cache is much more effec3ve than flash 3ering architectures used by others – Caches current hot data, not yesterday’s – Caches data in granules 8x to 16x smaller than 3ering • Greatly improves the effec3veness of flash
Other Flash Features can be configured if needed E.g. Cache compression, Cache pinning, Flash Disks (for Temp) • Managed Services • Cloud Services • Consul3ng Services • Licensing
Smart Flash Cache I/Os
• Caches Read and Write I/Os in PCI flash • Transparently accelerates read and write intensive workloads – Up to 2.66 million 8K read IOPS from SQL – Up to 1.96 million 8K write IOPS from SQL
2.66 Million 8K Read 1.96 Million 8K Write IOPS from SQL
• Persistent write cache speeds database recovery • Exadata Flash Cache is much more effec3ve than flash 3ering architectures used by others – Caches current hot data, not yesterday’s – Caches data in granules 8x to 16x smaller than 3ering • Greatly improves the effec3veness of flash
Other Flash Features can be configured if needed E.g. Cache compression, Cache pinning, Flash Disks (for Temp) • Managed Services • Cloud Services • Consul3ng Services • Licensing
Flash Based Cell Disks Usage • Smart Flash Cache – Uses all available space by default – Managed automa3cally for maximum efficiency
• Flash-‐based Grid Disks – Premium, persistent database storage – Requires deliberate planning for efficient usage – One poten3al use case is for temp tablespace
• Managed Services • Cloud Services • Consul3ng Services • Licensing
Why Use Flash for Temp? • Even well tuned data warehouse environments oOen have significant temp I/O – Par3cularly related to large hash joins
• Large quan3ty of Exadata flash provides a great mechanism to offload IOPS from hard disks and improve temp I/O performance • Makes even more sense with flash cache compression and more flash on newer Exadatas • By default Exadata does nothing to speed up temp I/O – No storage cell offloading for temp – Flash cache is not used for temp opera3ons
• Must use flash-‐based grid disks to use flash for temp opera3ons • Managed Services • Cloud Services • Consul3ng Services • Licensing
Risks Evaluate the trade-‐offs and determine if flash-‐based grid disks are right for your environment • Reduced flash cache size – More impacPul for OLTP workloads than DW workloads
• Redundancy requirements for temp tablespace – External redundancy carries availability risk – even for temp – Normal redundancy requires using double the amount of flash
• May require addi3onal maintenance during patching • Does your database even have a lot of temp I/O?
• Managed Services • Cloud Services • Consul3ng Services • Licensing
Reward Case Study:
• Large organiza3on in Salt Lake City Area • Data Warehouse running on Exadata X2-‐2 ¼ rack (1TB flash) • Temp I/O significant bomleneck • Temp read and write I/O 4th and 5th top wait events on DB
• Significant performance improvement aOer moving temp tablespace to 340GB flash disk
• Dropped temp I/O out of the top 10 wait events • Temp I/O latency reduced 8X • Temp heavy SQL saw average of 3X performance improvement
• Managed Services • Cloud Services • Consul3ng Services • Licensing
Crea3ng Temp Tablespace on Flash 1) 2) 3) 4) 5) 6) 7)
Determine size of flash-‐based grid disk Drop Flash Cache Recreate Flash Cache at new reduced size Create the flash grid disk Create ASM Diskgroup on the flash grid disk Create temporary tablespace on the new diskgroup Alter users to use the new temp tablespace
• Managed Services • Cloud Services • Consul3ng Services • Licensing
Crea3ng Flash-‐based Grid Disks Find Current Celldisk detail using CellCLI: CELLCLI> list celldisk detail name: CD_00_cm01celadm01 diskType: HardDisk size: 528.734375G ... (12 total hard disks disks on a cell) name: FD_00_cm01celadm01 diskType: FlashDisk size: 22.875G ... (15 total flash disks on a cell)
• Managed Services • Cloud Services • Consul3ng Services • Licensing
Crea3ng Flash-‐based Grid Disks Find Current Flashcache detail: CELLCLI> list flashcache detail name: cm01celadm01_FLASHCACHE size: 364.75G
• Managed Services • Cloud Services • Consul3ng Services • Licensing
Crea3ng Flash-‐based Grid Disks Note: If keeping databases online do all steps to create the flash-‐based grid disk one cell at a 3me, otherwise could do cells in parallel. 1) Calculate the new flash cache and flash grid disk sizes: •
E.g. 100GB flash grid disk on X2-‐2 ¼ rack
• • •
364.75 (original flash cache) * 3 (cells) -‐ 100 (flash disk) = 994.25GB 994.25GB available for flash cache Divide by number of cells (e.g. 3) = 331.417GB flashcache per cell
• Managed Services • Cloud Services • Consul3ng Services • Licensing
Crea3ng Flash-‐based Grid Disks 2) Drop Flash Cache:
• If keeping databases online do one cell at a 3me, otherwise could do in parallel • If write-‐back flash is enabled will need to flush first •
cellcli > alter flashcache all flush
oracle@cm01dbadm01 ~]$ dcli -c cm01celadm01 cellcli -e drop flashcache cm01celadm01: Flash cache cm01celadm01_FLASHCACHE successfully dropped
• Managed Services • Cloud Services • Consul3ng Services • Licensing
Crea3ng Flash-‐based Grid Disks 3) Create Flashcache at new size:
• Total flash minus the size of the new grid disk
[oracle@cm01dbadm01 ~]$ dcli -c cm01celadm01 cellcli -e create flashcache all size=331.4167g cm01celadm01: Flash cache cm01celadm01_FLASHCACHE successfully created
• Managed Services • Cloud Services • Consul3ng Services • Licensing
Crea3ng Flash-‐based Grid Disks 4) Create new Flash-‐based Grid Disks [oracle@cm01dbadm01 ~]$ dcli -c cm01celadm01 cellcli -e create griddisk all flashdisk prefix=flash cm01celadm01: GridDisk flash_FD_00_cm01celadm01 successfully created cm01celadm01: GridDisk flash_FD_02_cm01celadm01 successfully created … cm01celadm01: GridDisk flash_FD_15_cm01celadm01 successfully created • Managed Services • Cloud Services • Consul3ng Services • Licensing
Crea3ng Flash-‐based Grid Disks Repeat Steps 2-‐4 for other storage cells • E.g. • • • • • •
dcli -‐c cm01celadm02 cellcli -‐e drop flashcache dcli -‐c cm01celadm02 cellcli -‐e create flashcache all size=331.4167g dcli -‐c cm01celadm02 cellcli -‐e create griddisk all flashdisk prefix=flash dcli -‐c cm01celadm03 cellcli -‐e drop flashcache dcli -‐c cm01celadm03 cellcli -‐e create flashcache all size=331.4167g dcli -‐c cm01celadm03 cellcli -‐e create griddisk all flashdisk prefix=flash
• Managed Services • Cloud Services • Consul3ng Services • Licensing
Create Flash-‐based ASM Diskgroup 1) Login via SQLPlus to ASM instance as sysasm 2) Create new diskgroup on the flash grid disk SQL> create diskgroup FLASH_TEMP normal redundancy disk 'o/*/flash*' attribute 'compatible.rdbms'='11.2.0.2.0', 'compatible.asm'='11.2.0.4.0', 'cell.smart_scan_capable'='TRUE', 'au_size'='4M'; Diskgroup created. • Managed Services • Cloud Services • Consul3ng Services • Licensing
Create Flash-‐based ASM Diskgroup 3) Mount new diskgroup on all ASM instances
• First, login to each ASM instance and verify state • Should already be mounted on instance created on
SELECT GROUP_NUMBER AS GRP_NUM, NAME, STATE, TOTAL_MB, FREE_MB, USABLE_FILE_MB, ROUND((CASE WHEN (TOTAL_MB != 0) THEN FREE_MB / TOTAL_MB ELSE 0 END), 2)*100 || '%' PERCENT_FREE FROM V$ASM_DISKGROUP ORDER BY 1; • Managed Services • Cloud Services • Consul3ng Services • Licensing
Create Flash-‐based ASM Diskgroup SQL> SELECT GROUP_NUMBER AS GRP_NUM, NAME, STATE, TOTAL_MB, FREE_MB FROM V$ASM_DISKGROUP ORDER BY 1; GRP_NUM ------0 1 2 3
NAME -----------FLASH_TEMP DATAC1 DBFS_DG RECOC1
STATE TOTAL_MB FREE_MB ------------ ----------- ----------DISMOUNTED 0 0 MOUNTED 7815168 7448472 MOUNTED 894720 893344 MOUNTED 11674368 11550648
• Managed Services • Cloud Services • Consul3ng Services • Licensing
Create Flash-‐based ASM Diskgroup SQL> alter diskgroup FLASH_TEMP mount; Diskgroup altered. SQL> SELECT GROUP_NUMBER AS GRP_NUM, NAME, STATE, TOTAL_MB, FREE_MB FROM V$ASM_DISKGROUP ORDER BY 1; GRP_NUM ------1 2 3 4
NAME -----------DATAC1 DBFS_DG RECOC1 FLASH_TEMP
STATE TOTAL_MB FREE_MB ------------ ----------- ----------MOUNTED 7815168 7448472 MOUNTED 894720 893344 MOUNTED 11674368 11550648 MOUNTED 102912 102168
• Managed Services • Cloud Services • Consul3ng Services • Licensing
Create Flash Temp Tablespace SQL> CREATE TEMPORARY TABLESPACE FLASH_TEMP TEMPFILE '+FLASH_TEMP' SIZE 20480M AUTOEXTEND ON NEXT 4096M MAXSIZE 20480M, '+FLASH_TEMP' SIZE 20480M AUTOEXTEND ON NEXT 4096M MAXSIZE 20480M TABLESPACE GROUP '' EXTENT MANAGEMENT LOCAL UNIFORM SIZE 64M / Tablespace created.
• Managed Services • Cloud Services • Consul3ng Services • Licensing
Move Users to Temp Tablespace SQL> ALTER USER SH TEMPORARY TABLESPACE FLASH_TEMP; User altered.
• Managed Services • Cloud Services • Consul3ng Services • Licensing
Conclusion When to use Exadata Flash for Temp tablespace? • IO bomlenecked • Significant por3on of IO is from Temp opera3ons • Typically suited more for DW than OLTP workloads – Because DW workloads typically don’t use flash cache as much and drive larger temp opera3ons
• Newer Exadata versions – have more flash TEMP
• Managed Services • Cloud Services • Consul3ng Services • Licensing
Ques3ons?
• Managed Services • Cloud Services • Consul3ng Services • Licensing