A database extension for ORACLE, INFORMIX and PostgreSQL. Compact, fast loads, quick queries. Barrodale Computing Services Ltd

A database extension for ORACLE, INFORMIX and PostgreSQL Compact, fast loads, quick queries Barrodale Computing Services Ltd. DBXten The topics add...
3 downloads 1 Views 817KB Size
A database extension for ORACLE, INFORMIX and PostgreSQL Compact, fast loads, quick queries Barrodale Computing Services Ltd.

DBXten

The topics addressed in this presentation are: ‣

What is DBXten?



Where can DBXten be used?



How does DBXten improve performance?



How do you use DBXten?



How effective is DBXten?

DBXten

DBXten is a database extension that: ‣

Speeds up data loading and querying



Shrinks data space requirements



Works with Oracle, Informix, PostgreSQL



Comes with C, Java & SQL APIs



Installs quickly

DBXten

Ideal DBXten application characteristics are: ‣

Data generated by instruments, sensors



High volume feeds



Write Once, Read Many times

DBXten OCEAN SENSOR ARRAY LATITUDE

LONGITUDE

DEPTH

TIME

SALINITY

TEMPERATURE

CONDUCTIVITY

48° 3’ 22” N

124° 21’ 08” W

628.41 m

2009.07.09 12:42

4.23

6.21931

43675

48° 3’ 22” N

124° 21’ 08” W

628.41 m

2009.07.09 12:43

4.23

6.21847

44532

48° 3’ 22” N

124° 21’ 08” W

628.41 m

2009.07.09 12:44

4.23

6.21527

43658

48° 3’ 22” N

124° 21’ 08” W

628.41 m

2009.07.09 12:45

4.23

6.21092

44085

48° 3’ 22” N

124° 21’ 08” W

628.41 m

2009.07.09 12:46

4.23

6.21654

43969

48° 3’ 22” N

124° 21’ 08” W

628.41 m

2009.07.09 12:47

4.23

6.21416

43862

48° 3’ 22” N

124° 21’ 08” W

628.41 m

2009.07.09 12:48

4.23

6.21702

44498

48° 3’ 22” N

124° 21’ 08” W

628.41 m

2009.07.09 12:49

4.24

6.21911

44114

48° 3’ 22” N

124° 21’ 08” W

628.41 m

2009.07.09 12:50

4.24

6.21763

44049

48° 3’ 22” N

124° 21’ 08” W

628.41 m

2009.07.09 12:51

4.24

6.21263

44525

48° 3’ 22” N

124° 21’ 08” W

628.41 m

2009.07.09 12:52

4.24

6.21048

43635

48° 3’ 22” N

124° 21’ 08” W

628.41 m

2009.07.09 12:53

4.28

6.21737

44346

48° 3’ 22” N

124° 21’ 08” W

628.41 m

2009.07.09 12:54

4.29

6.21329

44355

48° 3’ 22” N

124° 21’ 08” W

628.41 m

2009.07.09 12:55

4.31

6.21633

44412

48° 3’ 22” N

124° 21’ 08” W

628.41 m

2009.07.09 12:56

4.33

6.21757

43710

48° 3’ 22” N

124° 21’ 08” W

628.41 m

2009.07.09 12:57

4.33

6.21992

43600

48° 3’ 22” N

124° 21’ 08” W

628.41 m

2009.07.09 12:58

4.33

6.21366

44273

48° 3’ 22” N

124° 21’ 08” W

628.41 m

2009.07.09 12:59

4.33

6.21248

44453

48° 3’ 22” N

124° 21’ 08” W

628.41 m

2009.07.09 13:00

4.33

6.21983

44214

48° 3’ 22” N

124° 21’ 08” W

628.41 m

2009.07.09 13:01

4.33

6.21723

44321

48° 3’ 22” N

124° 21’ 08” W

628.41 m

2009.07.09 13:02

4.33

6.21754

44310

48° 3’ 22” N

124° 21’ 08” W

628.41 m

2009.07.09 13:03

4.33

6.21652

44584

...

DBXten NASA FLIGHT R

Θ

Φ

TIME

ΔR

ΔΘ

ΔΦ

TEMPERATURE

PRESSURE ...

4711.75

79.12

24.79

91373

0.73

0.00

0.00

-27.683

0.4478

4712.50

79.12

24.79

91374

0.75

0.00

0.00

-27.705

0.4476

0.71

4713.21

79.12

24.79

91375

0.00

0.00

-27.727

0.4474

4713.98

79.12

24.79

91376

0.77

0.00

0.00

-27.736

0.4472

4714.73

79.12

24.79

91377

0.75

0.00

0.00

-27.775

0.4470

0.76

4715.49

79.12

24.79

91378

0.00

0.00

-27.871

0.4468

4716.26

79.12

24.79

91379

0.77

0.00

0.00

-27.966

0.4466

4716.98

79.12

24.79

91380

0.72

0.00

0.00

-28.005

0.4464

0.79

4717.77

79.12

24.79

91381

0.00

0.00

-28.032

0.4462

4718.56

79.12

24.79

91382

0.79

0.00

0.00

-28.102

0.4460

4719.35

79.12

24.79

91383

0.79

0.00

0.00

-28.131

0.4458

0.75

4720.10

79.12

24.79

91384

0.00

0.00

-28.209

0.4456

4720.82

79.12

24.79

91385

0.72

0.00

0.00

-28.288

0.4454

4721.56

79.12

24.79

91386

0.74

0.00

0.00

-28.331

0.4452

0.80

4722.36

79.12

24.79

91387

0.00

0.00

-28.360

0.4450

4723.08

79.12

24.79

91388

0.72

0.00

0.00

-28.380

0.4448

4723.83

79.12

24.79

91389

0.75

0.00

0.00

-28.382

0.4446

4724.56

79.12

24.79

91390

0.73

0.00

0.00

-28.402

0.4444

4725.31

79.12

24.79

91391

0.75

0.00

0.00

-28.501

0.4442

4726.10

79.12

24.79

91392

0.79

0.00

0.00

-28.526

0.4440

0.00

0.00

-28.608

0.4438

0.00

0.00

-28.622

0.4436

4726.84

79.12

24.79

91393

0.74

4727.59

79.12

24.79

91394

0.75

DBXten NUCLEAR REACTOR Time

Valve

1986.04.26 03:22:04

Open

Open

1986.04.26 03:22:05

Open

Open

1

. . . Valve

q

P1

...

...

Tn

Neutron Density 1

...

Neutron Density p

Pm

T1

4.27

3.62

727

662

7351

4178

4.28

3.62

732

665

7364

4178

721

666

7336

4178

1986.04.26 03:22:06

Open

Open

4.28

3.62

1986.04.26 03:22:07

Open

Open

4.29

3.62

728

667

7379

4178

1986.04.26 03:22:08

Open

Open

4.28

3.62

724

672

7335

4178

727

679

7390

4178

1986.04.26 03:22:09

Open

Open

4.26

3.62

1986.04.26 03:22:10

Open

Open

4.24

3.62

726

681

7341

4178

1986.04.26 03:22:11

Open

Open

4.27

3.62

730

682

7340

4178

720

683

7331

4178

1986.04.26 03:22:12

Open

Open

4.26

3.62

1986.04.26 03:22:13

Open

Open

4.31

3.62

723

685

7339

4178

1986.04.26 03:22:14

Open

Closed

4.31

3.62

728

692

7366

4178

727

728

7370

4186

1986.04.26 03:22:15

Open

Closed

4.29

4.16

1986.04.26 03:22:16

Open

Closed

4.32

4.53

731

754

7377

4237

1986.04.26 03:22:17

Open

Closed

4.32

5.02

723

769

7370

4272

732

795

7374

4309

1986.04.26 03:22:18

Open

Closed

4.35

5.23

1986.04.26 03:22:19

Open

Closed

4.28

5.32

723

837

7369

4348

1986.04.26 03:22:20

Open

Closed

4.35

5.67

730

881

7366

4370

729

928

7328

4417

1986.04.26 03:22:21

Open

Closed

4.27

5.88

1986.04.26 03:22:22

Open

Closed

4.28

6.27

725

964

7364

4427

1986.04.26 03:22:23

Open

Closed

4.29

6.62

730

983

7339

4438 4471 4528

1986.04.26 03:22:24

Open

Closed

4.31

6.91

720

1037

7346

1986.04.26 03:22:25

Open

Closed

4.28

7.09

722

1062

7371

...

DBXten NATURE OF THE DATA ‣

Sampling rate chosen not to miss “interesting” events



Repetitive or simply patterned



Low information density

Great candidate for compression

DBXten BENEFITS OF COMPRESSION

E K E WE

S D N

Less Happier Less tuning, tuning, Happier administration DBAs administration DBAs Faster Faster loads loads Smaller Smaller tables tables

Happier Happier users users

Faster Faster queries queries

Smaller Smaller indexes indexes

Less Less I/O I/O

DBXten COST OF COMPRESSION

CPU CPU cycles cycles

DBXten STANDARD INFORMIX COMPRESSION

Row

Name

Dept

Salary

City

State

1

Fred Smith 500

10000 Raleigh

NC

2

John Smith 500

20000 Raleigh

NC

Symbol Row

Data stored on disk

1

01

02

03

10000

04

2

05

02

03

20000

04

Pattern

01

Fred

02

Smith

03

500

04

Raleigh NC

05

John

DBXten STANDARD ORACLE COMPRESSION Row

Name

Dept

Salary

City

State

1

Fred Smith 500

10000 Raleigh

NC

2

John Smith 500

20000 Raleigh

NC

Symbol

Row

Data stored on disk

1

01

02

03

10000

04

05

2

06

02

03

20000

04

05

Pattern

01

Fred

02

Smith

03

500

04

Raleigh

05

NC

06

John

DBXten DBXten COLUMN COMPRESSION SALINITY 4.23 4.23

SALINITY

7

×

4.23

4

×

4.24

4.23 4.23 4.23 4.23

4.28

4.23 4.24 4.24

4.29

4.24 4.24 4.28

4.31

4.29 4.31 4.33 4.33 4.33 4.33 4.33 4.33 4.33 4.33

8

×

4.33

DBXten SEQUENCES & PATTERNS 1, 3, 5, 7, 9, 11, 13 ...

Arithmetic Sequence Vi = V0 + Δ × i

1, 3, 5, 7, 9, 1, 3, 5, 7, 9, 1, 3, 5, ...

Arithmetic Cycle Vi = V0 + Δ × (i + offset) mod len

4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, ...

Run Length V = n rep V0

9, 16, 7, 8, 9, 16, 7, 8, 9, 16, 7, 8, ...

Repeated Sequence Vi = Si mod len

DBXten DIFFERENCES

7, 8, 11, 16, 17, 20, 25, 26, 29 ... 7 7 7 7 7 7 ...

+ + + + +

1 1 1 1 1

Repeated Sequence of differences Δ Vi = Si mod len

+ + + +

3 3 + 5 3 + 5 + 1 3 + 5 + 1 + 3

Very useful for expressing timestamps (usually floating point values representing the number of seconds since Jan 1, 1970, UTC). For example, 4pm every Tuesday and 6pm every Thursday.

DBXten FLOATING POINT Sign

Exponent

Mantissa

‣ Very wide range of values ‣ Storing excessive precision is wasteful ‣ Rounding to a user-declared precision gives new opportunities for compression

DBXten FLOATING POINT USER DECLARED PRECISION

64 bit

16 bit

DBXten FLOATING POINT USER DECLARED PRECISION 64 bit doubles 4711.75452 4711.75503 4711.75764 4711.76013 4711.76335 4711.76435 4711.76626 4711.76782 4711.76864 4711.76876 4711.77237 4711.77408 4711.77465 4711.77844 4711.78009 4711.78062 4711.78417 4711.78454

Precision .01 471175 471176 471176 471176 471176 471176 471177 471177 471177 471177 471177 471177 471177 471178 471178 471178 471178 471178

Run length encoded 471175 5

471176

7

471177

5

471178

DBXten

US PATENT APPLICATION 11/779,791 JULY 21, 2006

DBXten LOADING

‣ One column of “chips” ‣ Self-describing, includes schema ‣ Portable chips ‣ Table in table functionality

DBXten IN OTHER WORDS… Instrument Id Block 1:

Block 2:

Block 3:

1 1 1 1 1 1 1 1

Datetime 2008-02-01 00:12:23 2008-02-01 00:12:25 2008-02-01 00:12:27 2008-02-01 00:12:29 2008-02-01 00:12:31 2008-02-01 00:12:33 … 2008-02-01 00:13:43

Latitude 46.343 46.344 46.345 46.346 46.347 46.349 … 46.392

Longitude -127.386 -127.385 -127.383 -127.382 -127.381 -127.379 … -127.335

Depth 14.34 16.82 18.85 21.22 23.66 26.05 … 104.82

1 1 1 1 1 1 1 1

2008-02-01 00:13:45 2008-02-01 00:13:47 2008-02-01 00:13:49 2008-02-01 00:13:51 2008-02-01 00:13:53 2008-02-01 00:13:55 … 2008-02-01 00:14:59

46.394 46.395 46.396 46.398 46.399 46.400 … 46.439

-127.334 -127.332 -127.331 -127.330 -127.328 -127.327 … -127.289

106.83 108.90 110.99 113.32 115.38 117.65 … 188.40

1 1 1 1 1 1 1 1

2008-02-01 00:15:01 2008-02-01 00:15:03 2008-02-01 00:15:05 2008-02-01 00:15:07 2008-02-01 00:15:09 2008-02-01 00:15:11 … 2008-02-01 00:16:15

46.441 46.442 46.443 46.444 46.446 46.447 … 46.486

-127.287 -127.286 -127.285 -127.283 -127.282 -127.281 … -127.241

190.88 193.22 195.65 197.86 200.02 202.38 … 274.68

Measurement 10.2 11.9 10.7 11.7 10.9 11.4 … 10.7

10.7 13.1 10.7 11.4 10.2 10.9 … 10.3

9.7 11.9 11.5 10.5 11.2 10.7 … 11.3

Instrument Id 1 1 1

Chip Chip 1 Chip 2 Chip 3

DBXten SUPPLIED LOADERS



CSV Loader



NetCDF Loader



Draw and Load (DaL)

DBXten INDEXING

Bounds Bounds extraction extraction

Min

Max

Min

Max

Min

Max

DBXten INDEXING

CREATE TABLE measurements (id serial primary key, chip DSChip);

CREATE INDEX measurements_idx ON measurements(bounds(chip) dsbox_ops) USING rtree;

DBXten RETRIEVAL

Chip Chip selection selection

Tuple Tuple extraction extraction

DBXten RETRIEVAL

EXECUTE FUNCTION DSQueryToStrings( "SELECT id::integer, DSChipExtract(chip, 'dt 2008-02-01 00:13:00 2008-02-01 00:14:00, lat * *, lon * *', 'dt,value') FROM measurements WHERE overlap( DSAsBoxString(chip,'datetime,lat,lon')::DSBox, DSRangeToBox('datetime 2008-02-01 00:13:00 2008-02-01 00:14:00, lat * *,lon * *')::DSBox)" );

DBXten INFORMIX VIRTUAL TABLE INTERFACE create table measurements_VTI ( id integer, dt datetime year to fraction (4), lat double precision, lon double precision, value integer) using DSChipAccess( basetable='measurements', keylist='bounds(chip):(dt,lat,long)'); select id, dt, lat, lon, value from measurements_VTI where lat .... and lon .... and dt ....;

DBXten BENCHMARK



3 GHz Intel Pentium D, 2 GB RAM OS: Fedora Core 7



Loaded 25 million CSV records prepared from NOAA Sea Surface Temperature data, supplied via a UNIX pipe



Four queries run against 50 million records

DBXten BENCHMARK

Conventional

With DBXten

ORACLE

INFORMIX

SQL*Loader

High Performance Loader

Direct mode with compression

IDS compression

CSV Chip Loader tool

CSV Chip Loader tool

DBXten BENCHMARK Unindexed

Indexed

Space

Load

Space

MB

secs

MB

1,894

2,836

848

Load

Query

secs

secs

3,172

3,574

205

1,410

860

1,576

85

2,024

3,036

2,556

4,069

176

733

1,318

736

1,367

23

ORACLE Conventional with DBXten INFORMIX Conventional with DBXten

DBXten PERFORMANCE IMPROVEMENT WITH DBXten Unindexed

Indexed

Space

Load

Space

Load

ORACLE

2.2

2.0

3.7

2.3

2.4

INFORMIX

2.8

2.3

3.5

3.0

7.7

Query

DBXten SUMMARY

‣ ‣

2 or 3 times improvement over vendor compression techniques C, Java & SQL APIs



Fully documented

A database extension for INFORMIX, ORACLE and PostgreSQL www.barrodale.com [email protected]