Database Administration and Tuning semester

                                                                                         Department  of  Computer  Science  and  Engineering         ...
3 downloads 0 Views 334KB Size
                                                                                         Department  of  Computer  Science  and  Engineering          

             2013/2014  

                                                                       Database  Administration  and  Tuning                                  2nd   semester    

         Mini-­‐Project  1    

  Question  1  –  SQL  Server  Databases     Present  SQL  Server  T-­‐SQL  commands  for  accomplishing  the  following  tasks:         (a)   Create   a   database   named   SuperHeroDatabase,   containing   one   log   file   and   five   different   data   files,   in   three   distinct   filegroups   (i.e.,   one   data   file   in   the   primary   filegroup,   two   files   in   a   secondary   filegroup,   and   two   other   files   in   a   second   secondary   filegroup).   The   log   file   should   have   an   initial   size   of   25MB   and   a   maximum   size   of   100MB.   All   data   files   should   have   an   unlimited   maximum   size,   except   the   one   in   the   primary  filegroup,  which  should  have  a  maximum  size  of  1GB).  The  first  data  file  on  the   first   secondary   filegroup   should   have   an   initial   size   of   50MB,   and   the   remaining   files   should  have  an  initial  size  of  10MB.  All  files  should  grow  at  a  rate  of  50%,  except  for   the   data   file   in   the   primary   filegroup,   which   should   grow   by   5MB,   every   time   this   is   required.       (b)  Create  a  table  named  SuperHeroes  in  the  SuperHeroDatabase  database.  The  table   should  have  a  numeric  attribute  named  heroID,  that  identifies  the  individual  records,   an   alphanumeric   attribute   named   heroName,   and   two   other   numeric   attributes   named  heroStrength  and  heroSpeed.  The  table  should  be  partitioned  so  that  all  tuples   where   heroID   is   less   or   equal   than   10   are   physically   stored   in   the   primary   filegroup,   all   tuples  where  the  heroID  is  greater  than  10,  but  less  or  equal  than  100,  are  physically   stored   in   the   second   secondary   filegroup,   and   the   remaining   tuples   are   physically   stored  in  the  first  secondary  filegroup.     (c)  In  the  table  named  SuperHeroes,  each  hero’s  running/flying  speed  is  stored  in  an   attribute  named  heroSpeed,  in  terms  of  meters  per  second.  Create  an  index  over  the   table   named   SuperHeroes   with   a   search   key   corresponding   to   each   hero’s   speed   in   kilometers  per  second,  and  including  also  the  hero’s  strength  as  an  additional  attribute   that   is   not   part   of   the   search   key.   The   index   should   be   physically   stored   in   the   primary   filegroup.  Indicate  also  if  the  index  is  clustered  or  non-­‐clustered,  justifying.     Question  2  –  B+Tree  Index  Structures     Consider   the   B+-­‐tree   data   structure   introduced   in   the   theoretical   classes.   Show   the   result  of  inserting  the  following  keys,  in  the  given  order,  into  an  empty  B+-­‐tree  where   nodes  can  hold  up  to  4  values:     batman  ;  captain-­‐america  ;  constantine  ;  daredevil  ;  hulk  ;  ironman  ;  nova  ;  sandman  ;   spiderman   ;   superman   ;   swamp-­‐thing   ;   beast   ;   hawkman   ;   aquaman   ;   thor   ;   thing   ;   wolverine  ;  zatanna     (a)    Draw  the  tree  before  the  first  and  all  consecutive  split  operations.   IST/DEI  

 

Pág.  1  de  8  

 

Administração  e  Optimização  de  Bases  de  Dados     (b)   Delete   the   following   keys   from   the   B+tree   data   structure   from   the   previous   exercise:   ironman   ;   captain-­‐america   ;   beast   ;   daredevil.   Draw   the   tree   after   each   deletion.     3  –  Extendable  Hashing  Index  Schemes     Consider   the   extendible   hashing   indexing   mechanism   introduced   in   the   theoretical   classes.  Show  how  data  records  with  the  following  keys  can  be  stored  in  buckets  which   individually  can  hold  two  records,  using  extendible  hashing  and  considering  the  least   significant  bits  first  in  the  directory  of  the  resulting  data  structure     ironman  ;  sandman  ;  batman  ;  thor  ;  hulk  ;  wolverine  ;  zatanna  ;  daredevil     Consider   that   binary   representations   for   keys   can   be   determined   by   logically   connecting   the   characters   (e.g.,   their   ASCII   encodings)   using   the   XOR   operator,   resulting  in:     ironman 11010011 sandman 11011101 batman 11010100 thor 11000010 hulk 11000101 wolverine 11010001 zatanna 11000010 daredevil 11011010     4  –  Hard  Disk  Characteristics  and  Accessing  Data  on  Hard  Disks     Consider  a  hard  disk  with  the  following  specifications,  and  where  the  outer  cylinders   have   double   the   density   than   the   inner   ones   (i.e.,   an   outer   cylinder   has   twice   the   number  of  blocks  than  an  inner  cylinder).   •

The  disk  has  4  platters,  and  2  surfaces  each  platter;  



The  disk  has  16,365  cylinders,  with  5455  inner  cylinders  and  10910  outer  cylinders   (the  density  is  be  different  in  these  two  types  of  cylinders);  



In  the  disk,  1  block  equals  a  total  of  512  bytes;  



The  disk  does  one  full  revolution  in  4.17  milliseconds;  



The   average   seek   time   (i.e.,   the   time   it   takes   the   head   assembly   on   the   actuator   arm   to   travel   to   the   track   of   the   disk   where   the   data   is   located)   is   of   12   milliseconds;  



The  total  usable  capacity  is  500  Gigabytes;  

IST/DEI  

 

Pág.  2  de  8  

 

Administração  e  Optimização  de  Bases  de  Dados    

(a)  What  is  the  total  number  of  blocks  on  the  hard  disk?   (b)  How  many  of  the  blocks  are  on  the  inner  cylinders  of  the  disk?   (c)  How  many  blocks  are  on  the  outer  cylinders?  

 

(d)  On  the  inner  cylinders  of  the  disk,  how  many  blocks  are  on  each  track?

 

(e)  Once  the  head  arrives  at  the  beginning  of  an  inner  block,  how  much  time  does  it   take  to  read  a  block  off  the  disk?   (f)  What  is  the  expected  time  to  read  a  block  that  resides  on  an  inner  cylinder  of  the   disk?  Your  answer  should  include  the  three  types  of  delays  that  are  involved.     5  –  Estimating  the  Cost  of  Relational  Algebra  Operations     Consider  the  following  relational  schema:     Books (person-name, book-name) Characters(book-name, super-hero-name)   The  relation  Books  stores  information  about  the  books  owned  by  each  person,  and  the   relation  Characters  stores  information  about  the  super-­‐heroes  of  each  book.       All   tuples   have   fixed   size.   The   relation   Books   has   1500   pages   and   the   relation   Characters  has  2000  pages.  Each  page  of  Books  contains  100  tuples  and  each  page  of   Characters  50  tuples.     Compute  the  number  of  I/Os  performed  by  each  of  the  following  algorithms:     1. Selection   on   the   Characters   relation   where   the   filtering   condition   is   superhero-name = ‘Batman’,  assuming  there  is  no  index  on  the  table.   2. Nested  Loop  Join,  with  Books  as  the  outer  relation  and  the  join  condition  is  on   book-­‐name.   3. Sort-­‐Merge   Join,   assuming   that   the   relations   are   not   ordered   on   book-­‐name   and  that  you  can  have  3  pages  in  memory  when  sorting  the  relations.                    

IST/DEI  

 

Pág.  3  de  8  

 

Administração  e  Optimização  de  Bases  de  Dados   Solution     1  -­‐  SQL  Server  Databases     Creating  database     CREATE DATABASE SuperHeroDatabase ON PRIMARY (NAME = DB_Primary_1, FILENAME="DB_Primary_F1.mdf", size=10MB, FILEGROWTH=5MB, MAXSIZE=1GB), FILEGROUP SECONDARY_FIRST (NAME = DB_Secondary_1_F1, FILENAME="DB_Secondary_1_F1.ndf", size=50MB, FILEGROWTH=50%), (NAME = DB_Secondary_1_F2, FILENAME="DB_Secondary_1_F2.ndf", size=10MB, FILEGROWTH=50%), FILEGROUP SECONDARY_SECOND (NAME = DB_Secondary_2_F1, FILENAME="DB_Secondary_2_F1.ndf", size=10MB, FILEGROWTH=50%), (NAME = DB_Secondary_2_F2, FILENAME="DB_Secondary_2_F2.ndf", size=10MB, FILEGROWTH=50%), LOG ON (NAME = DB_Log, FILENAME= "DB_Log.ldf", size=25MB, MAXSIZE=100MB, FILEGROWTH=50%);

Creating  Table   USE SuperHeroDatabase; CREATE PARTITION FUNCTION DB_Range1(INT) AS RANGE LEFT FOR VALUES (10,100); CREATE PARTITION SCHEME DB_PartScheme1 AS PARTITION DB_Range1 TO ([PRIMARY],SECONDARY_SECOND,SECONDARY_FIRST); CREATE TABLE SuperHeroes ( heroID INT NOT NULL, heroName VARCHAR(50), heroStrength INT, heroSpeed INT, CONSTRAINT CS1 PRIMARY KEY(heroID) ON DB_PartScheme1(heroID);

IST/DEI  

 

Pág.  4  de  8  

 

Administração  e  Optimização  de  Bases  de  Dados   Creating  index   USE SuperHeroDatabase; ALTER TABLE SuperHeroes ADD heroSpeed2 as (heroSpeed / 1000); CREATE NONCLUSTERED INDEX IDX_V1 ON SuperHeroes (heroSpeed2) INCLUDE (heroStrength); ON ([PRIMARY]);

  2  –  B+Tree  Index  Structures     Insertion  into  B+Tree  

 

 

 

 

IST/DEI  

 

Pág.  5  de  8  

 

Administração  e  Optimização  de  Bases  de  Dados  

 

      Removal  from  B+Tree          

         

 

             

3  –  Extendable  Hashing  Index  Schemes   IST/DEI  

 

Pág.  6  de  8  

 

Administração  e  Optimização  de  Bases  de  Dados    

  3  –  Hard  Disk  Characteristics  and  Accessing  Data  on  Hard  Disks     Answer  (a)   =  500  GB  /  512  B   =  536870912000  /  512   =  1048576000     =  500  *  221     Answer  (b)     IST/DEI  

 

Pág.  7  de  8  

 

 

Administração  e  Optimização  de  Bases  de  Dados   The  5455  inner  cylinders  =  1/5  of  the  disk,  since  (5455  *  4  *  2  +  10910  *  2  *  4  *  2)  =   218200  and  (218200  /  (5455  *  4  *  2))  =  5   =  1048576000  *  1/5   =  209715200   =  100  *  221     Answer  (c)     =  1048576000  -­‐  209715200   =  838860800   =  400  *  221     =  100  *  223     Answer  (d)     =  209715200  /  (4  *  2  *  5455)   ~=  4805       Answer  (e)     =  4.17  /  4805   =  0.00086774253  milliseconds     Answer  (f)     Seek  time  =  12   Half  Rotation  =  4.17  /  2  =  2.085   Transfer  =  0.00086774253   =  12  +  2.085  +  0.00086774253  =  14.0858677425     5  –  Estimating  the  Cost  of  Relational  Algebra  Operations     1. 2000  pages     2. n(Books)  *  n-­‐pages(Characters)  +  n-­‐pages(Characters)  =  100*1500*2000  +  1500   =  300001500  IOs       3. Cost  for  sorting  Characters  =  2000  *  (2  *    ⎡log2(2000/3)⎤  +  1)  =  42000   Cost  for  sorting  Books    =  1500  *  (2  *  ⎡log2(1500/3)  ⎤  +  1)  =  28500   Cost  for  join  (after  sorting)  =  2000  +  1500  =  4500   Total  cost  =  4500  +  42000  +  28500  =  74000  IOs  

IST/DEI  

 

Pág.  8  de  8