What is the purpose of the Secondary NameNode in Hadoop?
a. To manage task scheduling
b. To merge and back up filesystem metadata
c. To handle data processing tasks
d. To store data replicas
Added by Lauren M.
Step 1
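The Secondary NameNode is not a standby or replacement for the primary NameNode. Its job concerns filesystem metadata: it periodically fetches the NameNode's fsimage (the namespace snapshot) and edit log, merges them into a new checkpoint, and hands the merged image back. This keeps the edit log from growing without bound and leaves a recent copy of the metadata on a second machine. The correct answer is therefore (b).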
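The metadata function referred to above is checkpointing, that is, folding the logged namespace changes into the last saved snapshot. The following is a toy sketch of that idea only. The dictionary layout, the operation names (`create`, `delete`, `rename`), and the function `apply_edits` are all hypothetical and do not mirror Hadoop's real fsimage or edit log formats.

```python
# Toy model of an HDFS checkpoint. Everything here is illustrative:
# real fsimage/edit-log files are binary and far richer than this.

def apply_edits(fsimage, edits):
    """Return a new namespace snapshot with the logged edits applied in order."""
    merged = dict(fsimage)  # copy, so the previous snapshot stays untouched
    for op, path, value in edits:
        if op == "create":
            merged[path] = value            # value = number of blocks (assumed)
        elif op == "delete":
            merged.pop(path, None)
        elif op == "rename":
            merged[value] = merged.pop(path)  # value = new path
        else:
            raise ValueError(f"unknown operation: {op}")
    return merged

# Last saved snapshot: path -> number of blocks (hypothetical contents).
fsimage = {"/a.txt": 3, "/b.txt": 1}

# Changes logged by the NameNode since that snapshot.
edits = [
    ("create", "/c.txt", 2),
    ("delete", "/b.txt", None),
    ("rename", "/a.txt", "/d.txt"),
]

checkpoint = apply_edits(fsimage, edits)  # the merge step
edits = []                                # the log starts fresh after a checkpoint

print(checkpoint)
```

The point of the sketch is that the merge happens away from the NameNode's main work, and that the result is both a shorter log to replay at restart and a second copy of the metadata.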
Recommended Videos
Madhur L.
Which of the following statements are true? Select one or more:
a. SQL Server Agent allows you to automate a variety of administrative tasks.
b. A backup can be used to restore a small number of files after they have been accidentally deleted.
c. SQL Server Agent is used to schedule automatic backups on a regular basis.
d. To secure data, a schedule can be set for automatic backups.
e. A customized maintenance plan can be created using jobs.
Yujie W.
HW 3 HDFS, Lecture 5

Consider a small cluster with 20 machines: 19 DataNodes and 1 NameNode. Each node in the cluster has a total of 2 Terabytes of hard disk space and 2 Gigabytes of main memory available. The cluster uses a block size of 64 Megabytes (MB) and a replication factor of 3. The master maintains 100 bytes of metadata for each 64 MB block.

(a) Let's upload the file wiki_dump.xml (with a size of 600 Megabytes) to HDFS. Explain what effect this upload has on the number of occupied HDFS blocks.

(b) Figure 1 shows an excerpt of wiki_dump.xml's structure. Explain the relationship between an HDFS block, an InputSplit and a record based on this excerpt.

    <dump time="1483027930">
      <page id="EN3234"> ... ... ... </page>    } 80.2 MB
      <page id="DE5434"> ... ... ... </page>    } 0.6 MB
      ...
    </dump>

Figure 1: Excerpt of wiki_dump.xml. Each Wikipedia page is stored within a <page> element. The element with id EN3234 contains 80.2 Megabytes of textual content.

(c) You are the only user of the cluster and write a Hadoop job to extract information from wiki_dump.xml. You want to speed up the job by testing different block size configurations: besides the existing 64 MB configuration, you also consider 32 MB and 128 MB block sizes. Which configuration do you think will lead to the fastest job execution? Explain why.

(d) Let us assume no files are currently stored on HDFS. You are given 100 million files, each one with a size of 100 Kilobytes. How many of those can you upload successfully to the cluster, considering the storage restrictions (memory/disk) on the NameNode and the DataNodes? Explain your answer.
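The arithmetic behind parts (a) and (d) can be sketched as below. This is one possible reading of the problem, not an official solution. It assumes decimal units (1 MB = 10^6 bytes), that all 2 GB of NameNode memory is available for block metadata, that each small file occupies exactly one block, and that a block smaller than the block size only consumes its actual size on disk. All variable names are made up for illustration.

```python
import math

# Assumption: decimal units throughout (1 KB = 10**3, 1 MB = 10**6, ...).
KB, MB, GB, TB = 10**3, 10**6, 10**9, 10**12

block_size = 64 * MB
replication = 3
metadata_per_block = 100      # bytes, as stated in the problem
datanodes = 19
disk_per_node = 2 * TB
namenode_memory = 2 * GB      # assumed fully usable for block metadata

# (a) A 600 MB file is split into 64 MB blocks; the last block is partial.
file_size = 600 * MB
blocks = math.ceil(file_size / block_size)   # logical blocks
replicas = blocks * replication              # physical block copies stored

# (d) Each 100 KB file needs one block, hence one metadata entry.
small_file = 100 * KB
limit_by_memory = namenode_memory // metadata_per_block
limit_by_disk = (datanodes * disk_per_node) // (small_file * replication)
max_files = min(limit_by_memory, limit_by_disk, 100_000_000)

print("blocks:", blocks, "replicas:", replicas)
print("limit by NameNode memory:", limit_by_memory)
print("limit by DataNode disk:", limit_by_disk)
print("files that fit:", max_files)
```

Under these assumptions the NameNode's memory, not the DataNodes' disk space, is the binding limit for many small files. Using binary units or reserving memory for other NameNode structures would shift the exact figures.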
Akash M.
Recommended Textbooks
Computer Science and Information Technology
Introduction to Programming Using Python
Computer Science - An Overview