Name service
Domain Name System (DNS)
Name: an identifier for computers, services, remote objects, files, users, and so on. Names are a fundamental component of distributed systems; they support communication and resource sharing.
Need a system that maps between names and IP addresses: Name ↔ IP address
• A URL-form name is used to access a specific web page.
• Resources shared among several processes have consistent names used by all of those processes.
• Users communicate with each other through their email addresses.
Another way to identify an object: by its attributes.
A name service stores a collection of bindings between names and attributes.
Major operation: resolve a name.
General requirements:
• handle an arbitrary number of names and serve an arbitrary number of organizations
• a long lifetime
• high availability
• fault isolation
• tolerance of mistrust
Name space: the collection of all valid names.
When the Internet was small, a host file with two columns (name and address) was enough. Every host stored one copy and updated it periodically from a master host file. This is impossible for today's Internet.
One simple solution: a single central server. Disadvantages: inefficient; unreliable.
Another solution: distribution and replication, using a client/server group model.
Names are unique.
Two ways to organize the name space:
Flat: a name is a sequence of characters without structure.
• cannot be used in a large system such as the Internet.
Hierarchical: each name is composed of several parts.
• called the domain name space
• each organization can choose the prefix name for its hosts independently.

In the domain name space, names are defined in an inverted-tree structure.
• Each node in the tree has a label and a domain name.
• Label: a string with a maximum of 63 characters. The root label is an empty string.
• Children of a node have different labels.
• Domain name: the sequence of labels from the current node up to the root, separated by dots.
• Fully Qualified Domain Name (FQDN): a complete domain name, ending at the root.
• Partially Qualified Domain Name (PQDN): a domain name that ends at some node other than the root.

DNS in the Internet
DNS can be used on different platforms.
Generic domains
• com: commercial organizations
• edu: universities and other educational institutions
• gov: US governmental agencies
• mil: US military organizations
• net: major network support centers
• org: organizations not mentioned above
• int: international organizations
Country domains
• ca: Canada; us: United States; ...
• Countries other than the USA use their own subdomains to distinguish types of organization, e.g. co.uk, ac.uk.
Inverse domain
• Maps an address to a name.
• Example: a server has a list of authorized clients, but sees only the IP address in a packet. The server may ask its resolver to send a query to the DNS server asking for a mapping of address to name.
• This is called an inverse query (or pointer query).
• The queried name has the form “inverse-IP.in-addr.arpa”.
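The “inverse-IP.in-addr.arpa” form can be illustrated with a short sketch. The function name `reverse_lookup_name` is hypothetical, and the address is the one that appears in the URL example elsewhere in these notes; the standard library's `ipaddress` module is used only as a cross-check.

```python
import ipaddress


def reverse_lookup_name(ipv4: str) -> str:
    """Build the inverse-domain name for a dotted-quad IPv4 address."""
    octets = ipv4.split(".")
    if len(octets) != 4 or not all(o.isdigit() and int(o) <= 255 for o in octets):
        raise ValueError(f"not an IPv4 address: {ipv4!r}")
    # Reverse the octets so the most specific label comes first,
    # matching the leaf-to-root order of a domain name.
    return ".".join(reversed(octets)) + ".in-addr.arpa."


name = reverse_lookup_name("138.37.88.61")
print(name)  # 61.88.37.138.in-addr.arpa.

# Cross-check against the standard library (which omits the trailing root dot).
assert name.rstrip(".") == ipaddress.ip_address("138.37.88.61").reverse_pointer
```

The octets are reversed because a domain name is read from the leaf up to the root, while an IP address is written with its most general part first.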
DNS queries
• Host name resolution: get IP addresses from host names.
• Looking up an e-mail host.
• Reverse resolution: the name server replies only if the IP address is in its own domain.
• Others: see the textbook.

Figure: resolving the URL http://www.cdk3.net:80/WebExamples/earth.html
• A DNS lookup maps the host name www.cdk3.net to the IP number 138.37.88.61, giving the resource ID (IP number, port number, pathname) = (138.37.88.61, 80, WebExamples/earth.html).
• An ARP lookup (Ethernet) maps the IP number to the network address 2:60:8c:2:b0:5a.
• The port number identifies the socket of the web server, and the pathname identifies the file.

Distribution of name space
• DNS data are divided into zones, and each DNS server is responsible for zero or more zones.
• A master file for a zone (zone file) is entered by the system administrator.
• Zones vs. domains
• Each zone must be held by at least two servers.
• DNS servers are organized in the same way as the hierarchy of names. Each server contains part of the naming database: the data for its local domain. Each server also records the domain names and addresses of other servers.
• Root server: a server whose domain consists of the whole tree. It holds no detailed information and just maintains references to lower-level servers. Currently there are more than 13 root servers distributed around the world, each covering the whole domain name space.
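As a sketch of the URL example from the DNS queries material, the following splits the URL into its parts and builds the resource ID. The `HOSTS` table is a hypothetical stand-in for a real DNS lookup, so the example runs without network access.

```python
from urllib.parse import urlsplit

# Hypothetical stand-in for a DNS lookup (no network access needed).
HOSTS = {"www.cdk3.net": "138.37.88.61"}


def resource_id(url: str) -> tuple:
    """Return (IP number, port number, pathname) for an http URL."""
    parts = urlsplit(url)
    ip = HOSTS[parts.hostname]          # the "DNS lookup" step
    port = parts.port or 80             # default HTTP port if none is given
    path = parts.path.lstrip("/")
    return ip, port, path


print(resource_id("http://www.cdk3.net:80/WebExamples/earth.html"))
# ('138.37.88.61', 80, 'WebExamples/earth.html')
```

Only the host name goes through the name service; the port number and pathname are taken directly from the URL.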
Primary servers
• read zone data directly from a local master file
• are responsible for creating, maintaining, and updating the zone file
Secondary servers
• download zone data from other servers (primary or other secondary)
• communicate periodically with the primary server to check that their data match
Both are authoritative for the zones they serve: redundancy.
Zone transfer: a secondary server downloads the zone data from the primary server.
A server can be the primary server for one zone and a secondary server for another zone.

Name-Address Resolution
• A process calls a DNS client, called a resolver.
• The resolver accesses the closest DNS server with a mapping request.
• The server either replies with the information or tells the resolver which other servers have this information.
• The resolver delivers the result to the requesting process.
Most requests are “mapping names to addresses”.
Mapping addresses to names: the DNS client (resolver) reverses the IP address and appends “.in-addr.arpa.” to create a domain name.
Two approaches
Recursive resolution: the resolver expects the server to supply the final answer.
Iterative resolution:
• If the server cannot resolve the query, it returns to the client the IP address of the server that it thinks can resolve it.
• The client is responsible for repeating the query to this second server.
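Iterative resolution can be sketched with a toy, in-memory set of servers. Everything here is hypothetical: the server names, the `SERVERS` table, and the use of server names where a real server would return an IP address. It only illustrates the client repeating its query after each referral.

```python
# Toy name servers: each holds some records and some referrals to other servers.
SERVERS = {
    "root-server": {"records": {}, "referrals": {"net": "net-server"}},
    "net-server": {"records": {}, "referrals": {"cdk3.net": "cdk3-server"}},
    "cdk3-server": {"records": {"www.cdk3.net": "138.37.88.61"}, "referrals": {}},
}


def query(server: str, name: str):
    """Ask one server once: returns ('answer', ip) or ('referral', next_server)."""
    data = SERVERS[server]
    if name in data["records"]:
        return "answer", data["records"][name]
    labels = name.split(".")
    # Try the longest matching suffix first, e.g. "cdk3.net" before "net".
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in data["referrals"]:
            return "referral", data["referrals"][suffix]
    return "error", None


def resolve_iteratively(name: str, start: str = "root-server", max_steps: int = 10):
    """The client repeats the query itself, following each referral."""
    server, path = start, [start]
    for _ in range(max_steps):
        kind, value = query(server, name)
        if kind == "answer":
            return value, path
        if kind == "error":
            raise LookupError(f"cannot resolve {name!r}")
        server = value
        path.append(server)
    raise LookupError("too many referrals")


print(resolve_iteratively("www.cdk3.net"))
# ('138.37.88.61', ['root-server', 'net-server', 'cdk3-server'])
```

In recursive resolution the loop in `resolve_iteratively` would run inside the first server instead of in the client, and the client would see only the final answer.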
Caching technique in DNS
With recursive resolution, a server stores a mapping in its cache before sending it to the client.
One problem: a server may cache a mapping for a long time, so the client receives an out-of-date mapping.
Two simple techniques, both based on a “time-to-live” (TTL):
• The original (authoritative) server binds a TTL value to each mapping. It defines the time in seconds for which other servers can cache the mapping information.
• The receiving server sets a TTL for each mapping in its cache.

DNS Messages
Two types: query and response.
• A query message consists of a header and the question records.
• A response message consists of a header, question records, answer records, authority records, and additional records.

The header is 12 bytes:
• Identification: 16 bits, used by the client to match the response with the query
• Flags: 16 bits
 – QR (query/response): 1 bit, defines the type of message
 – OpCode: 4 bits, defines the type of query or response (0: standard, 1: inverse, etc.)
 – AA (authoritative answer): 1 bit, used in the caching technique (1: the answer comes from the original server)
 – TC (truncated): 1 bit, 1 means the response was more than 512 bytes and was truncated to 512
 – RD (recursion desired): 1 bit, 1 means the client desires a recursive answer (set in the query message, repeated in the response message)
 – RA (recursion available): 1 bit, 1 means that a recursive response is available (set in the response message)
 – Reserved: 3 bits, “000”
 – rCode: 4 bits, error code in the response (only the original server can set it)
• Number of question records: 16 bits
• Number of answer records: 16 bits, all 0s in a query message
• Number of authority records: 16 bits, all 0s in a query message
• Number of additional records: 16 bits, all 0s in a query message
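The 12-byte header layout above can be sketched with Python's `struct` module. The function names `pack_flags` and `build_query_header` and the identification value `0x1234` are hypothetical; the bit positions follow the field order listed above, with QR as the most significant bit.

```python
import struct


def pack_flags(qr=0, opcode=0, aa=0, tc=0, rd=0, ra=0, rcode=0) -> int:
    """Combine the flag fields into one 16-bit value (reserved bits stay 0)."""
    return (
        (qr << 15) | (opcode << 11) | (aa << 10) | (tc << 9)
        | (rd << 8) | (ra << 7) | rcode
    )


def build_query_header(ident: int, recursion_desired: bool = True) -> bytes:
    """Header of a standard query with one question record."""
    flags = pack_flags(qr=0, opcode=0, rd=int(recursion_desired))
    # Six 16-bit fields in network byte order: identification, flags, and the
    # four record counts (answer/authority/additional are 0 in a query).
    return struct.pack("!HHHHHH", ident, flags, 1, 0, 0, 0)


header = build_query_header(0x1234)
print(len(header), header.hex())  # 12 123401000001000000000000
```

A response would reuse the same identification value, set QR to 1, and fill in the record counts.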
DNS Messages: types of records
Question Record
• Used by the client to get information from a server.
• Query name: domain name, variable-length field
• Query type: 16 bits, e.g. 1: a 32-bit IPv4 address; 28: an IPv6 address; ...
• Query class: 16 bits, defines the specific protocol using DNS, e.g. 1: Internet; 2: CSNET network; ...
Resource Record
• Domain name
• Domain type
• Domain class
• Time-to-live: 32 bits, number of seconds
• Resource data length: 16 bits
• Resource data:
 – the answer to the query, in the answer section
 – the domain name of a server, in the authority section
 – additional information (IP address), in the additional section

Time
Time is important information in distributed systems.
• Precise time: e-commerce transactions; authentication protocols; checking whether a call message is a duplicate and whether it is valid (in Sun RPC messages); ...
• The order of events is important: e-mail.
Situation in distributed systems
• There is no global clock in distributed systems.
• Each computer has its own internal clock, and each clock has its own physical properties.
• Clock drift rate: the rate at which a computer clock diverges from a perfect reference clock.
• Two approaches to correct the problem:
 – time server (Cristian, 1989)
 – logical clock
Synchronizing physical clocks
External synchronization: the difference between each computer clock and an external reference clock is bounded by some constant.
• Time server: Cristian's method, the Network Time Protocol
Internal synchronization: the difference between any two computer clocks is bounded by some constant.
• Master/slaves: the Berkeley algorithm
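The two synchronization definitions can be compared with a small sketch. The clock readings, the reference time, and the bound `D` are made-up illustrative values. The example also shows a standard consequence: clocks that are each within D of a reference are within 2D of each other.

```python
from itertools import combinations


def externally_synchronized(clocks, reference, bound) -> bool:
    """Every clock is within `bound` of the external reference."""
    return all(abs(c - reference) < bound for c in clocks)


def internally_synchronized(clocks, bound) -> bool:
    """Every pair of clocks agrees to within `bound`."""
    return all(abs(a - b) < bound for a, b in combinations(clocks, 2))


reference = 1000.0               # hypothetical reference time, in seconds
clocks = [999.6, 1000.3, 1000.4]
D = 0.5

print(externally_synchronized(clocks, reference, D))  # True
print(internally_synchronized(clocks, D))             # False: 999.6 and 1000.4 differ by about 0.8
print(internally_synchronized(clocks, 2 * D))         # True
```

The converse does not hold: a set of clocks can agree closely with each other while all drifting away from the external reference together.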
Cristian's method: time server
1. The client process sends a time request to the time server.
2. After receiving the request, the server replies with the time T according to its clock.
3. The client sets its clock to T + (round-trip time/2).
Analysis
• There is no upper bound on message transmission delays.
• Its success depends on the round-trip times for message exchanges being short compared with the required accuracy.
A group of synchronized time servers
• The client multicasts its request to all the time servers in the LAN and uses the first reply.
• Better performance:
 – tolerates server failure and reply message omission failure;
 – the first reply has the smallest delay, so its time value is closer to the perfect time.

The Berkeley algorithm
• One computer is chosen to be the master.
• The master periodically polls the other computers, called slaves, whose clocks are to be synchronized.
• The slaves send back their clock values to the master.
• The master estimates their clock times from the round-trip times, and computes the average of all the clock times.
• The master sends each individual slave the amount by which to adjust its clock.
The reason for not sending the updated current time: to avoid the further uncertainty introduced by message transmission time.
One possible problem: readings from faulty clocks.
One simple fix: select a subset of clocks whose mutual differences are bounded by some specified value.
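A minimal sketch of both methods follows, under stated assumptions: times are plain numbers in an arbitrary unit, the function names are hypothetical, and the fault-tolerant subset is chosen as the largest group of estimates lying within `max_spread` of one another, which is one possible reading of the fix described above.

```python
def cristian_estimate(server_time: float, round_trip: float) -> float:
    """Cristian's estimate of the current time: T + (round-trip time / 2)."""
    return server_time + round_trip / 2.0


def largest_consistent_subset(values, max_spread):
    """Largest group of values whose mutual differences are at most max_spread."""
    ordered = sorted(values)
    best, start = [], 0
    for end in range(len(ordered)):
        while ordered[end] - ordered[start] > max_spread:
            start += 1
        if end - start + 1 > len(best):
            best = ordered[start:end + 1]
    return best


def berkeley_adjustments(master_time, slave_readings, max_spread):
    """slave_readings maps a slave name to (reported clock value, round-trip time)."""
    estimates = {"master": master_time}
    for name, (reported, round_trip) in slave_readings.items():
        estimates[name] = cristian_estimate(reported, round_trip)
    good = largest_consistent_subset(estimates.values(), max_spread)
    average = sum(good) / len(good)
    # Send each machine an adjustment amount, not the new time itself.
    return {name: average - value for name, value in estimates.items()}


readings = {"a": (3020, 10), "b": (2970, 10), "faulty": (9000, 10)}
print(berkeley_adjustments(3000, readings, max_spread=100))
# {'master': 0.0, 'a': -25.0, 'b': 25.0, 'faulty': -6005.0}
```

The faulty clock is excluded from the average but still receives an adjustment, so it is pulled back toward the others.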
The Network Time Protocol (NTP)
Cristian's method and the Berkeley algorithm are intended for intranets. The Network Time Protocol is a protocol to distribute time information over the Internet.
External synchronization: clocks are synchronized to agree with Coordinated Universal Time (UTC), within some fixed bound.
The NTP system consists of a network of primary and secondary time servers, clients, and interconnecting transmission paths (the synchronization subnet).
• A primary time server is directly synchronized to a primary reference source, usually a timecode receiver.
• A secondary time server is synchronized, possibly via other secondary servers, from a primary server over network paths.
A hierarchy: primary reference source at the root.
Stratum: reference level.
• Primary time server: level 1 (stratum 1)
• Accuracy decreases as the level increases.
The synchronization subnet is organized using the Bellman-Ford distributed routing algorithm, which computes minimum-weight spanning trees.

Figure: a synchronization subnet
• Numbers: stratum numbers
• Heavy lines: the active synchronization paths
• Light lines: backup synchronization paths, used for exchanging timing information but not necessarily for synchronizing local clocks
• Arrow direction: direction of timing information flow
• If x is out of service, the subnet reconfigures using backup paths.
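The stratum numbering and the fallback to a backup path can be sketched on a toy subnet. This is not the Bellman-Ford computation NTP uses; it assumes a hand-written, cycle-free preference list per server, and the server names `p1`, `p2`, `x`, `y`, `z` are hypothetical.

```python
PRIMARY = {"p1", "p2"}  # stratum-1 servers, attached to a reference source

# For each secondary server: candidate sources in order of preference.
# The first entry is the active path; later entries are backup paths.
SOURCES = {
    "x": ["p1"],
    "y": ["x", "p2"],
    "z": ["y"],
}


def active_source(server, failed):
    """First candidate source that is still in service, or None."""
    for source in SOURCES[server]:
        if source not in failed:
            return source
    return None


def stratum(server, failed=frozenset()):
    """Stratum = 1 for a primary server, else 1 + stratum of its active source."""
    if server in PRIMARY:
        return 1
    source = active_source(server, failed)
    if source is None:
        raise LookupError(f"{server} has no reachable time source")
    return stratum(source, failed) + 1


print({s: stratum(s) for s in "xyz"})          # {'x': 2, 'y': 3, 'z': 4}
print({s: stratum(s, {"x"}) for s in "yz"})    # {'y': 2, 'z': 3}
```

When `x` is out of service, `y` switches to its backup path to `p2`, and the stratum numbers of `y` and of everything below it change accordingly.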
Modes of operation
Multicast mode: for high-speed LANs with numerous computers that do not require the highest accuracy.
Procedure-call mode: for higher accuracy, or where multicast mode is not available.
Symmetric mode
• Used in a distributed environment.
• Pairs of servers exchange messages containing time information.
• Symmetric active mode: used by servers operating near the leaves (high stratum levels) of the synchronization subnet and with preconfigured peer addresses.
• Symmetric passive mode: used by servers operating near the root (low stratum levels) and with a relatively large number of peers.

NTP message
• Follows the IP and UDP headers.
• Timestamp: 64 bits, a 32-bit integer part for seconds plus a 32-bit fraction part, counted from January 1, 1900.
• Leap Indicator; Version number; operating mode; stratum number; local-clock precision
• Poll Interval (Poll): the maximum interval between successive NTP messages.
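The 64-bit timestamp format can be sketched as a conversion to and from Unix time. The function names are hypothetical; the constant is the number of seconds between January 1, 1900 and January 1, 1970. The sketch assumes times that fit in the 32-bit seconds field and does not handle the wrap-around of that field.

```python
NTP_UNIX_OFFSET = 2_208_988_800  # seconds from 1900-01-01 to 1970-01-01


def to_ntp_timestamp(unix_seconds: float) -> int:
    """Pack a Unix time into 32 bits of seconds plus 32 bits of fraction."""
    total = unix_seconds + NTP_UNIX_OFFSET
    seconds = int(total)
    fraction = int((total - seconds) * 2**32)
    return (seconds << 32) | fraction


def from_ntp_timestamp(timestamp: int) -> float:
    """Unpack a 64-bit NTP timestamp back into a Unix time."""
    seconds = timestamp >> 32
    fraction = timestamp & 0xFFFFFFFF
    return seconds + fraction / 2**32 - NTP_UNIX_OFFSET


ts = to_ntp_timestamp(0.5)        # half a second after the Unix epoch
print(hex(ts))                    # 0x83aa7e8080000000
print(from_ntp_timestamp(ts))     # 0.5
```

The 32-bit fraction gives a resolution of 2^-32 seconds, far finer than the accuracy achievable over a network.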
NTP message (continued)
Synchronization Distance, Synchronization Dispersion: indicate the round-trip delay and dispersion to the primary reference source.
Reference Clock Identifier, Reference Timestamp: identify the particular reference source and the time of its last update; used for management.
Originate Timestamp: the transmit timestamp in the last received NTP message (Ti-3).
Receive Timestamp: the local time when the latest NTP message was received (Ti-2).
Transmit Timestamp: the local time when this NTP message was transmitted (Ti-1).
Authenticator (optional): the key identifier and encrypted checksum of the message contents.

Filtering and peer-selection algorithms
Filtering algorithm: improves the offset estimate for a single peer clock.
• Minimum filter: for a given peer clock, selects the sample with the lowest delay from the n (e.g. 8) most recent samples.
• The samples are sorted by their delay value d.
• Filter dispersion: a quality indicator.
Peer-selection algorithm: finds the best clocks from a population to use as the synchronization source, to maintain high reliability.
• Used to adjust the local clock and the stratum number.
• Observation: the highest reliability is usually associated with the lowest stratum number and the lowest synchronization dispersion (accuracy).
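The minimum filter can be sketched as follows. The slides do not state how offset and delay are computed; this sketch assumes the usual NTP formulas over the four timestamps Ti-3, Ti-2, Ti-1, and Ti (the local arrival time of the reply). The class name `MinimumFilter` and the sample values are hypothetical.

```python
from collections import deque


def offset_and_delay(t_orig, t_recv, t_xmit, t_arrive):
    """t_orig = Ti-3, t_recv = Ti-2, t_xmit = Ti-1, t_arrive = Ti."""
    offset = ((t_recv - t_orig) + (t_xmit - t_arrive)) / 2.0
    delay = (t_arrive - t_orig) - (t_xmit - t_recv)
    return offset, delay


class MinimumFilter:
    """Keep the n most recent samples for one peer; prefer the lowest delay."""

    def __init__(self, n=8):
        self.samples = deque(maxlen=n)  # old samples fall off automatically

    def add(self, offset, delay):
        self.samples.append((offset, delay))

    def best_offset(self):
        offset, _delay = min(self.samples, key=lambda sample: sample[1])
        return offset


# Peer clock is 3 units ahead. The first exchange has symmetric delays (2 + 2),
# the second has asymmetric delays (6 + 2), which distorts its offset estimate.
peer = MinimumFilter()
peer.add(*offset_and_delay(100, 105, 106, 105))   # offset 3.0, delay 4
peer.add(*offset_and_delay(200, 209, 210, 209))   # offset 5.0, delay 8
print(peer.best_offset())  # 3.0
```

Samples with low delay leave less room for asymmetry between the outbound and return paths, which is why the lowest-delay sample tends to give the most trustworthy offset.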