SQOOP experiments

[cloudera@localhost ~]$ sudo service mysqld start
Starting mysqld:                                           [  OK  ]
[cloudera@localhost ~]$ mysql -u root
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.1.73 Source distribution

Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

EXPERIMENT 1:

mysql> create database sampledata;
Query OK, 1 row affected (0.00 sec)

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| db1                |
| mysql              |
| sampledata         |
| sampledb           |
| test               |
+--------------------+
6 rows in set (0.07 sec)

mysql> use sampledata;
Database changed
mysql> create table mentor(mno int);
Query OK, 0 rows affected (0.05 sec)

mysql> create table mentee(id int, name char);
Query OK, 0 rows affected (0.00 sec)

mysql> insert into mentor values(101),(102),(103),(104),(105);
Query OK, 5 rows affected (0.00 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> insert into mentee values(10,'a'),(20,'b'),(30,'c'),(40,'d'),(50,'e');
Query OK, 5 rows affected (0.00 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> exit
Bye

EXPERIMENT 2:

[cloudera@localhost ~]$ sqoop list-databases --connect "jdbc:mysql://localhost"
--username root

21/10/19 06:06:43 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
information_schema
db1
mysql
sampledata
sampledb
test
[cloudera@localhost ~]$ sqoop list-tables --connect "jdbc:mysql://localhost/sampledata"
--username root

21/10/19 06:07:16 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
mentee
mentor
[cloudera@localhost ~]$ sqoop eval --connect "jdbc:mysql://localhost/sampledata"
--username root --query "select *from mentee"

21/10/19 06:08:48 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
-------------------
| id          | name |
-------------------
| 10          | a |
| 20          | b |
| 30          | c |
| 40          | d |
| 50          | e |
-------------------
[cloudera@localhost ~]$ sqoop eval --connect "jdbc:mysql://localhost/sampledata"
--username root --query "select count(*) from mentor"

21/10/19 06:09:31 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
------------------------
| count(*)             |
------------------------
| 5                    |
------------------------
[cloudera@localhost ~]$ sqoop eval --connect "jdbc:mysql://localhost/sampledata"
--username root --query "select *  from mentor where mno>103"

21/10/19 06:11:30 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
---------------
| mno         |
---------------
| 104         |
| 105         |
---------------
[cloudera@localhost ~]$ sqoop eval --connect "jdbc:mysql://localhost/sampledata"
--username root --query "insert into mentee values(6,'f')"

21/10/19 06:15:14 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
21/10/19 06:15:15 INFO tool.EvalSqlTool: 1 row(s) updated.
[cloudera@localhost ~]$ sqoop eval --connect "jdbc:mysql://localhost/sampledata" --username root --query "select *  from mentee"
21/10/19 06:15:34 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
-------------------
| id          | name |
-------------------
| 10          | a |
| 20          | b |
| 30          | c |
| 40          | d |
| 50          | e |
| 6           | f |
-------------------

EXPERIMENT 3:

[cloudera@localhost ~]$ sqoop list-databases --connect "jdbc:mysql://localhost"
--username root

[cloudera@localhost ~]$ sqoop list-tables --connect "jdbc:mysql://localhost/sampledata"
--username root

21/10/19 06:33:20 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
mentee
mentor

[cloudera@localhost ~]$ hadoop fs -rm -R /user/cloudera/mentor
Moved: 'hdfs://localhost.localdomain:8020/user/cloudera/mentor' to trash at: hdfs://localhost.localdomain:8020/user/cloudera/.Trash/Current

[cloudera@localhost ~]$ hadoop fs -rm -R /user/cloudera/mentee
Moved: 'hdfs://localhost.localdomain:8020/user/cloudera/mentee' to trash at: hdfs://localhost.localdomain:8020/user/cloudera/.Trash/Current

[cloudera@localhost ~]$ hadoop fs -ls /user/cloudera/mentor
ls: `/user/cloudera/mentor': No such file or directory

[cloudera@localhost ~]$ hadoop fs -ls /user/cloudera/mentee
ls: `/user/cloudera/mentee': No such file or directory

[cloudera@localhost ~]$ sqoop import --connect "jdbc:mysql://localhost/sampledata"
--username root --table mentor -m 1

21/10/19 06:38:27 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
21/10/19 06:38:27 INFO tool.CodeGenTool: Beginning code generation
21/10/19 06:38:28 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `mentor` AS t LIMIT 1
21/10/19 06:38:28 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `mentor` AS t LIMIT 1
21/10/19 06:38:28 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-0.20-mapreduce
21/10/19 06:38:28 INFO orm.CompilationManager: Found hadoop core jar at: /usr/lib/hadoop-0.20-mapreduce/hadoop-core.jar
Note: /tmp/sqoop-cloudera/compile/1fd8f0aa24117cb20be9f086034af550/mentor.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
21/10/19 06:38:34 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-cloudera/compile/1fd8f0aa24117cb20be9f086034af550/mentor.jar
21/10/19 06:38:34 WARN manager.MySQLManager: It looks like you are importing from mysql.
21/10/19 06:38:34 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
21/10/19 06:38:34 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
21/10/19 06:38:34 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
21/10/19 06:38:34 INFO mapreduce.ImportJobBase: Beginning import of mentor
21/10/19 06:38:37 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
21/10/19 06:38:40 INFO mapred.JobClient: Running job: job_202110190550_0004
21/10/19 06:38:41 INFO mapred.JobClient:  map 0% reduce 0%
21/10/19 06:39:00 INFO mapred.JobClient:  map 100% reduce 0%
21/10/19 06:39:05 INFO mapred.JobClient: Job complete: job_202110190550_0004
21/10/19 06:39:05 INFO mapred.JobClient: Counters: 23
21/10/19 06:39:05 INFO mapred.JobClient:   File System Counters
21/10/19 06:39:05 INFO mapred.JobClient:     FILE: Number of bytes read=0
21/10/19 06:39:05 INFO mapred.JobClient:     FILE: Number of bytes written=175163
21/10/19 06:39:05 INFO mapred.JobClient:     FILE: Number of read operations=0
21/10/19 06:39:05 INFO mapred.JobClient:     FILE: Number of large read operations=0
21/10/19 06:39:05 INFO mapred.JobClient:     FILE: Number of write operations=0
21/10/19 06:39:05 INFO mapred.JobClient:     HDFS: Number of bytes read=87
21/10/19 06:39:05 INFO mapred.JobClient:     HDFS: Number of bytes written=20
21/10/19 06:39:05 INFO mapred.JobClient:     HDFS: Number of read operations=1
21/10/19 06:39:05 INFO mapred.JobClient:     HDFS: Number of large read operations=0
21/10/19 06:39:05 INFO mapred.JobClient:     HDFS: Number of write operations=1
21/10/19 06:39:05 INFO mapred.JobClient:   Job Counters
21/10/19 06:39:05 INFO mapred.JobClient:     Launched map tasks=1
21/10/19 06:39:05 INFO mapred.JobClient:     Total time spent by all maps in occupied slots (ms)=22059
21/10/19 06:39:05 INFO mapred.JobClient:     Total time spent by all reduces in occupied slots (ms)=0
21/10/19 06:39:05 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
21/10/19 06:39:05 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
21/10/19 06:39:05 INFO mapred.JobClient:   Map-Reduce Framework
21/10/19 06:39:05 INFO mapred.JobClient:     Map input records=5
21/10/19 06:39:05 INFO mapred.JobClient:     Map output records=5
21/10/19 06:39:05 INFO mapred.JobClient:     Input split bytes=87
21/10/19 06:39:05 INFO mapred.JobClient:     Spilled Records=0
21/10/19 06:39:05 INFO mapred.JobClient:     CPU time spent (ms)=250
21/10/19 06:39:05 INFO mapred.JobClient:     Physical memory (bytes) snapshot=98820096
21/10/19 06:39:05 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=656605184
21/10/19 06:39:05 INFO mapred.JobClient:     Total committed heap usage (bytes)=60751872
21/10/19 06:39:05 INFO mapreduce.ImportJobBase: Transferred 20 bytes in 30.3395 seconds (0.6592 bytes/sec)
21/10/19 06:39:05 INFO mapreduce.ImportJobBase: Retrieved 5 records.

[cloudera@localhost ~]$ hadoop fs -ls /user/cloudera
Found 9 items
drwx------   - cloudera cloudera          0 2019-12-17 22:29 /user/cloudera/.Trash
drwx------   - cloudera cloudera          0 2021-10-19 06:39 /user/cloudera/.staging
drwxr-xr-x   - cloudera cloudera          0 2021-10-15 01:39 /user/cloudera/_sqoop
drwxr-xr-x   - cloudera cloudera          0 2021-10-15 01:19 /user/cloudera/hp2
drwxr-xr-x   - cloudera cloudera          0 2021-10-19 06:39 /user/cloudera/mentor
drwxr-xr-x   - cloudera cloudera          0 2021-10-12 01:57 /user/cloudera/stu
-rw-r--r--   3 cloudera cloudera         12 2021-10-12 02:11 /user/cloudera/stu1
drwxr-xr-x   - cloudera cloudera          0 2021-10-12 01:58 /user/cloudera/tea
-rw-r--r--   3 cloudera cloudera         12 2021-10-15 01:43 /user/cloudera/test1

[cloudera@localhost ~]$ hadoop fs -ls /user/cloudera/mentor
Found 3 items
-rw-r--r--   3 cloudera cloudera          0 2021-10-19 06:39 /user/cloudera/mentor/_SUCCESS
drwxr-xr-x   - cloudera cloudera          0 2021-10-19 06:38 /user/cloudera/mentor/_logs
-rw-r--r--   3 cloudera cloudera         20 2021-10-19 06:38 /user/cloudera/mentor/part-m-00000

[cloudera@localhost ~]$ hadoop fs -cat /user/cloudera/mentor/part*
101
102
103
104
105

[cloudera@localhost ~]$ sqoop import --connect "jdbc:mysql://localhost/sampledata"
--username root --table mentor --target-dir /user/cloudera/hat1 -m 1

21/10/19 06:46:54 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
21/10/19 06:46:54 INFO tool.CodeGenTool: Beginning code generation
21/10/19 06:46:55 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `mentor` AS t LIMIT 1
21/10/19 06:46:55 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `mentor` AS t LIMIT 1
21/10/19 06:46:55 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-0.20-mapreduce
21/10/19 06:46:55 INFO orm.CompilationManager: Found hadoop core jar at: /usr/lib/hadoop-0.20-mapreduce/hadoop-core.jar
Note: /tmp/sqoop-cloudera/compile/1e263d0a41690e9153daec3aba95b7ef/mentor.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
21/10/19 06:47:00 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-cloudera/compile/1e263d0a41690e9153daec3aba95b7ef/mentor.jar
21/10/19 06:47:00 WARN manager.MySQLManager: It looks like you are importing from mysql.
21/10/19 06:47:00 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
21/10/19 06:47:00 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
21/10/19 06:47:00 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
21/10/19 06:47:00 INFO mapreduce.ImportJobBase: Beginning import of mentor
21/10/19 06:47:04 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
21/10/19 06:47:07 INFO mapred.JobClient: Running job: job_202110190550_0006
21/10/19 06:47:08 INFO mapred.JobClient:  map 0% reduce 0%
21/10/19 06:47:27 INFO mapred.JobClient:  map 100% reduce 0%
21/10/19 06:47:33 INFO mapred.JobClient: Job complete: job_202110190550_0006
21/10/19 06:47:33 INFO mapred.JobClient: Counters: 23
21/10/19 06:47:33 INFO mapred.JobClient:   File System Counters
21/10/19 06:47:33 INFO mapred.JobClient:     FILE: Number of bytes read=0
21/10/19 06:47:33 INFO mapred.JobClient:     FILE: Number of bytes written=175172
21/10/19 06:47:33 INFO mapred.JobClient:     FILE: Number of read operations=0
21/10/19 06:47:33 INFO mapred.JobClient:     FILE: Number of large read operations=0
21/10/19 06:47:33 INFO mapred.JobClient:     FILE: Number of write operations=0
21/10/19 06:47:33 INFO mapred.JobClient:     HDFS: Number of bytes read=87
21/10/19 06:47:33 INFO mapred.JobClient:     HDFS: Number of bytes written=20
21/10/19 06:47:33 INFO mapred.JobClient:     HDFS: Number of read operations=1
21/10/19 06:47:33 INFO mapred.JobClient:     HDFS: Number of large read operations=0
21/10/19 06:47:33 INFO mapred.JobClient:     HDFS: Number of write operations=1
21/10/19 06:47:33 INFO mapred.JobClient:   Job Counters
21/10/19 06:47:33 INFO mapred.JobClient:     Launched map tasks=1
21/10/19 06:47:33 INFO mapred.JobClient:     Total time spent by all maps in occupied slots (ms)=21067
21/10/19 06:47:33 INFO mapred.JobClient:     Total time spent by all reduces in occupied slots (ms)=0
21/10/19 06:47:33 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
21/10/19 06:47:33 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
21/10/19 06:47:33 INFO mapred.JobClient:   Map-Reduce Framework
21/10/19 06:47:33 INFO mapred.JobClient:     Map input records=5
21/10/19 06:47:33 INFO mapred.JobClient:     Map output records=5
21/10/19 06:47:33 INFO mapred.JobClient:     Input split bytes=87
21/10/19 06:47:33 INFO mapred.JobClient:     Spilled Records=0
21/10/19 06:47:33 INFO mapred.JobClient:     CPU time spent (ms)=240
21/10/19 06:47:33 INFO mapred.JobClient:     Physical memory (bytes) snapshot=94601216
21/10/19 06:47:33 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=656605184
21/10/19 06:47:33 INFO mapred.JobClient:     Total committed heap usage (bytes)=60751872
21/10/19 06:47:33 INFO mapreduce.ImportJobBase: Transferred 20 bytes in 31.093 seconds (0.6432 bytes/sec)
21/10/19 06:47:33 INFO mapreduce.ImportJobBase: Retrieved 5 records.

[cloudera@localhost ~]$ hadoop fs -ls /user/cloudera/hat1
Found 3 items
-rw-r--r--   3 cloudera cloudera          0 2021-10-19 06:47 /user/cloudera/hat1/_SUCCESS
drwxr-xr-x   - cloudera cloudera          0 2021-10-19 06:47 /user/cloudera/hat1/_logs
-rw-r--r--   3 cloudera cloudera         20 2021-10-19 06:47 /user/cloudera/hat1/part-m-00000

[cloudera@localhost ~]$ hadoop fs -cat /user/cloudera/hat1/part*
101
102
103
104
105

[cloudera@localhost ~]$ sqoop import --connect "jdbc:mysql://localhost/sampledata"
--username root --table mentor --where "mno>'102'"
--target-dir /user/cloudera/hat1/lab1 -m 1

21/10/19 06:52:31 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
21/10/19 06:52:31 INFO tool.CodeGenTool: Beginning code generation
21/10/19 06:52:32 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `mentor` AS t LIMIT 1
21/10/19 06:52:32 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `mentor` AS t LIMIT 1
21/10/19 06:52:32 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-0.20-mapreduce
21/10/19 06:52:32 INFO orm.CompilationManager: Found hadoop core jar at: /usr/lib/hadoop-0.20-mapreduce/hadoop-core.jar
Note: /tmp/sqoop-cloudera/compile/bbb5f073950ee93a796e9da3d27af988/mentor.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
21/10/19 06:52:37 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-cloudera/compile/bbb5f073950ee93a796e9da3d27af988/mentor.jar
21/10/19 06:52:37 WARN manager.MySQLManager: It looks like you are importing from mysql.
21/10/19 06:52:37 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
21/10/19 06:52:37 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
21/10/19 06:52:37 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
21/10/19 06:52:37 INFO mapreduce.ImportJobBase: Beginning import of mentor
21/10/19 06:52:40 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
21/10/19 06:52:43 INFO mapred.JobClient: Running job: job_202110190550_0007
21/10/19 06:52:44 INFO mapred.JobClient:  map 0% reduce 0%
21/10/19 06:53:03 INFO mapred.JobClient:  map 100% reduce 0%
21/10/19 06:53:08 INFO mapred.JobClient: Job complete: job_202110190550_0007
21/10/19 06:53:08 INFO mapred.JobClient: Counters: 23
21/10/19 06:53:08 INFO mapred.JobClient:   File System Counters
21/10/19 06:53:08 INFO mapred.JobClient:     FILE: Number of bytes read=0
21/10/19 06:53:08 INFO mapred.JobClient:     FILE: Number of bytes written=175593
21/10/19 06:53:08 INFO mapred.JobClient:     FILE: Number of read operations=0
21/10/19 06:53:08 INFO mapred.JobClient:     FILE: Number of large read operations=0
21/10/19 06:53:08 INFO mapred.JobClient:     FILE: Number of write operations=0
21/10/19 06:53:08 INFO mapred.JobClient:     HDFS: Number of bytes read=87
21/10/19 06:53:08 INFO mapred.JobClient:     HDFS: Number of bytes written=12
21/10/19 06:53:08 INFO mapred.JobClient:     HDFS: Number of read operations=1
21/10/19 06:53:08 INFO mapred.JobClient:     HDFS: Number of large read operations=0
21/10/19 06:53:08 INFO mapred.JobClient:     HDFS: Number of write operations=1
21/10/19 06:53:08 INFO mapred.JobClient:   Job Counters
21/10/19 06:53:08 INFO mapred.JobClient:     Launched map tasks=1
21/10/19 06:53:08 INFO mapred.JobClient:     Total time spent by all maps in occupied slots (ms)=20855
21/10/19 06:53:08 INFO mapred.JobClient:     Total time spent by all reduces in occupied slots (ms)=0
21/10/19 06:53:08 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
21/10/19 06:53:08 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
21/10/19 06:53:08 INFO mapred.JobClient:   Map-Reduce Framework
21/10/19 06:53:08 INFO mapred.JobClient:     Map input records=3
21/10/19 06:53:08 INFO mapred.JobClient:     Map output records=3
21/10/19 06:53:08 INFO mapred.JobClient:     Input split bytes=87
21/10/19 06:53:08 INFO mapred.JobClient:     Spilled Records=0
21/10/19 06:53:08 INFO mapred.JobClient:     CPU time spent (ms)=260
21/10/19 06:53:08 INFO mapred.JobClient:     Physical memory (bytes) snapshot=100413440
21/10/19 06:53:08 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=656605184
21/10/19 06:53:08 INFO mapred.JobClient:     Total committed heap usage (bytes)=60751872
21/10/19 06:53:08 INFO mapreduce.ImportJobBase: Transferred 12 bytes in 29.8409 seconds (0.4021 bytes/sec)
21/10/19 06:53:08 INFO mapreduce.ImportJobBase: Retrieved 3 records.

[cloudera@localhost ~]$ hadoop fs -ls /user/cloudera
Found 11 items
drwx------   - cloudera cloudera          0 2019-12-17 22:29 /user/cloudera/.Trash
drwx------   - cloudera cloudera          0 2021-10-19 06:53 /user/cloudera/.staging
drwxr-xr-x   - cloudera cloudera          0 2021-10-15 01:39 /user/cloudera/_sqoop
drwxr-xr-x   - cloudera cloudera          0 2021-10-19 06:44 /user/cloudera/hat
drwxr-xr-x   - cloudera cloudera          0 2021-10-19 06:52 /user/cloudera/hat1
drwxr-xr-x   - cloudera cloudera          0 2021-10-15 01:19 /user/cloudera/hp2
drwxr-xr-x   - cloudera cloudera          0 2021-10-19 06:39 /user/cloudera/mentor
drwxr-xr-x   - cloudera cloudera          0 2021-10-12 01:57 /user/cloudera/stu
-rw-r--r--   3 cloudera cloudera         12 2021-10-12 02:11 /user/cloudera/stu1
drwxr-xr-x   - cloudera cloudera          0 2021-10-12 01:58 /user/cloudera/tea
-rw-r--r--   3 cloudera cloudera         12 2021-10-15 01:43 /user/cloudera/test1

[cloudera@localhost ~]$ hadoop fs -ls /user/cloudera/hat1
Found 4 items
-rw-r--r--   3 cloudera cloudera          0 2021-10-19 06:47 /user/cloudera/hat1/_SUCCESS
drwxr-xr-x   - cloudera cloudera          0 2021-10-19 06:47 /user/cloudera/hat1/_logs
drwxr-xr-x   - cloudera cloudera          0 2021-10-19 06:53 /user/cloudera/hat1/lab1
-rw-r--r--   3 cloudera cloudera         20 2021-10-19 06:47 /user/cloudera/hat1/part-m-00000

[cloudera@localhost ~]$ hadoop fs -cat /user/cloudera/hat1/lab1/part*
103
104
105

[cloudera@localhost ~]$ sqoop eval --connect "jdbc:mysql://localhost/sampledata"
--username root --query "insert into mentor values (106),(107),(108)"

21/10/19 07:02:15 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
21/10/19 07:02:16 INFO tool.EvalSqlTool: 3 row(s) updated.

[cloudera@localhost ~]$ sqoop import --connect "jdbc:mysql://localhost/sampledata"
--username root --table mentor --target-dir /user/cloudera/hat1/lab1
--incremental append --check-column mno --last-value 105 -m 1

21/10/19 07:06:25 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
21/10/19 07:06:25 INFO tool.CodeGenTool: Beginning code generation
21/10/19 07:06:26 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `mentor` AS t LIMIT 1
21/10/19 07:06:26 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `mentor` AS t LIMIT 1
21/10/19 07:06:26 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-0.20-mapreduce
21/10/19 07:06:26 INFO orm.CompilationManager: Found hadoop core jar at: /usr/lib/hadoop-0.20-mapreduce/hadoop-core.jar
Note: /tmp/sqoop-cloudera/compile/4c1c9bcdd05df90850bd31d01931160b/mentor.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
21/10/19 07:06:32 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-cloudera/compile/4c1c9bcdd05df90850bd31d01931160b/mentor.jar
21/10/19 07:06:32 INFO tool.ImportTool: Maximal id query for free form incremental import: SELECT MAX(`mno`) FROM mentor
21/10/19 07:06:32 INFO tool.ImportTool: Incremental import based on column `mno`
21/10/19 07:06:32 INFO tool.ImportTool: Lower bound value: 105
21/10/19 07:06:32 INFO tool.ImportTool: Upper bound value: 108
21/10/19 07:06:32 WARN manager.MySQLManager: It looks like you are importing from mysql.
21/10/19 07:06:32 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
21/10/19 07:06:32 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
21/10/19 07:06:32 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
21/10/19 07:06:32 INFO mapreduce.ImportJobBase: Beginning import of mentor
21/10/19 07:06:35 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
21/10/19 07:06:38 INFO mapred.JobClient: Running job: job_202110190550_0008
21/10/19 07:06:39 INFO mapred.JobClient:  map 0% reduce 0%
21/10/19 07:06:59 INFO mapred.JobClient:  map 100% reduce 0%
21/10/19 07:07:04 INFO mapred.JobClient: Job complete: job_202110190550_0008
21/10/19 07:07:04 INFO mapred.JobClient: Counters: 23
21/10/19 07:07:04 INFO mapred.JobClient:   File System Counters
21/10/19 07:07:04 INFO mapred.JobClient:     FILE: Number of bytes read=0
21/10/19 07:07:04 INFO mapred.JobClient:     FILE: Number of bytes written=175623
21/10/19 07:07:04 INFO mapred.JobClient:     FILE: Number of read operations=0
21/10/19 07:07:04 INFO mapred.JobClient:     FILE: Number of large read operations=0
21/10/19 07:07:04 INFO mapred.JobClient:     FILE: Number of write operations=0
21/10/19 07:07:04 INFO mapred.JobClient:     HDFS: Number of bytes read=87
21/10/19 07:07:04 INFO mapred.JobClient:     HDFS: Number of bytes written=20
21/10/19 07:07:04 INFO mapred.JobClient:     HDFS: Number of read operations=1
21/10/19 07:07:04 INFO mapred.JobClient:     HDFS: Number of large read operations=0
21/10/19 07:07:04 INFO mapred.JobClient:     HDFS: Number of write operations=1
21/10/19 07:07:04 INFO mapred.JobClient:   Job Counters
21/10/19 07:07:04 INFO mapred.JobClient:     Launched map tasks=1
21/10/19 07:07:04 INFO mapred.JobClient:     Total time spent by all maps in occupied slots (ms)=21962
21/10/19 07:07:04 INFO mapred.JobClient:     Total time spent by all reduces in occupied slots (ms)=0
21/10/19 07:07:04 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
21/10/19 07:07:04 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
21/10/19 07:07:04 INFO mapred.JobClient:   Map-Reduce Framework
21/10/19 07:07:04 INFO mapred.JobClient:     Map input records=5
21/10/19 07:07:04 INFO mapred.JobClient:     Map output records=5
21/10/19 07:07:04 INFO mapred.JobClient:     Input split bytes=87
21/10/19 07:07:04 INFO mapred.JobClient:     Spilled Records=0
21/10/19 07:07:04 INFO mapred.JobClient:     CPU time spent (ms)=210
21/10/19 07:07:04 INFO mapred.JobClient:     Physical memory (bytes) snapshot=99799040
21/10/19 07:07:04 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=656605184
21/10/19 07:07:04 INFO mapred.JobClient:     Total committed heap usage (bytes)=60751872
21/10/19 07:07:04 INFO mapreduce.ImportJobBase: Transferred 20 bytes in 31.0254 seconds (0.6446 bytes/sec)
21/10/19 07:07:04 INFO mapreduce.ImportJobBase: Retrieved 5 records.
21/10/19 07:07:04 INFO util.AppendUtils: Appending to directory lab1
21/10/19 07:07:04 INFO util.AppendUtils: Using found partition 1
21/10/19 07:07:04 INFO tool.ImportTool: Incremental import complete! To run another incremental import of all data following this import, supply the following arguments:
21/10/19 07:07:04 INFO tool.ImportTool:  --incremental append
21/10/19 07:07:04 INFO tool.ImportTool:   --check-column mno
21/10/19 07:07:04 INFO tool.ImportTool:   --last-value 108
21/10/19 07:07:04 INFO tool.ImportTool: (Consider saving this with 'sqoop job --create')

[cloudera@localhost ~]$ sqoop eval --connect "jdbc:mysql://localhost/sampledata"
--username root --query "select * from mentor"

21/10/19 07:08:05 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
---------------
| mno         |
---------------
| 101         |
| 102         |
| 103         |
| 104         |
| 105         |
| 106         |
| 107         |
| 108         |
---------------


EXPERIMENT 4:

[cloudera@localhost ~]$ gedit kk1
       1,a
       2,b
       3,c
    save & exit

[cloudera@localhost ~]$ hadoop fs -put kk1 /user/cloudera

[cloudera@localhost ~]$ hadoop fs -ls /user/cloudera
Found 57 items
drwx------   - cloudera cloudera            0 2020-03-09 20:43 /user/cloudera/.Trash
drwx------   - cloudera cloudera            0 2021-10-15 22:32 /user/cloudera/.staging
drwxr-xr-x   - cloudera cloudera            0 2021-10-15 22:32 /user/cloudera/_sqoop
-rw-r--r--   3 cloudera cloudera          282 2021-02-03 19:55 /user/cloudera/actor
-rw-r--r--   3 cloudera cloudera         1564 2021-02-03 22:03 /user/cloudera/actor1
-rw-r--r--   3 cloudera cloudera          212 2021-02-04 01:24 /user/cloudera/ara
drwxr-xr-x   - cloudera cloudera            0 2020-11-19 21:43 /user/cloudera/b3
-rw-r--r--   3 cloudera cloudera        59994 2020-11-24 20:48 /user/cloudera/cse553
-rw-r--r--   3 cloudera cloudera           28 2020-03-13 03:43 /user/cloudera/data
-rw-r--r--   3 cloudera cloudera           83 2020-03-11 22:59 /user/cloudera/datafile
drwxr-xr-x   - cloudera cloudera            0 2021-01-17 22:56 /user/cloudera/deepti
-rw-r--r--   3 cloudera cloudera         3101 2020-11-12 00:55 /user/cloudera/demo
-rw-r--r--   3 cloudera cloudera          155 2020-03-11 21:11 /user/cloudera/doctor
-rw-r--r--   3 cloudera cloudera         6201 2020-11-19 21:36 /user/cloudera/f1
-rw-r--r--   3 cloudera cloudera           67 2021-02-03 00:46 /user/cloudera/f2
-rw-r--r--   3 cloudera cloudera          110 2020-03-12 21:26 /user/cloudera/gobind
drwxr-xr-x   - cloudera supergroup          0 2020-11-26 00:49 /user/cloudera/hio
-rw-r--r--   3 cloudera cloudera           16 2021-10-15 22:34 /user/cloudera/kk1
drwxr-xr-x   - cloudera cloudera            0 2021-02-03 00:57 /user/cloudera/l2out
-rw-r--r--   3 cloudera cloudera           68 2021-10-11 22:48 /user/cloudera/lab1sem7.txt
-rw-r--r--   3 cloudera cloudera           21 2020-03-11 22:03 /user/cloudera/math
-rw-r--r--   3 cloudera cloudera          135 2021-01-28 00:25 /user/cloudera/movie
drwxr-xr-x   - cloudera cloudera            0 2021-10-11 03:12 /user/cloudera/nh001
-rw-r--r--   3 cloudera cloudera            0 2021-01-28 01:04 /user/cloudera/op
drwxr-xr-x   - cloudera cloudera            0 2020-11-24 21:02 /user/cloudera/op_grep
drwxr-xr-x   - cloudera cloudera            0 2020-11-24 20:55 /user/cloudera/opcse553
drwxr-xr-x   - cloudera cloudera            0 2021-01-17 23:50 /user/cloudera/out1
drwxr-xr-x   - cloudera cloudera            0 2021-01-18 00:03 /user/cloudera/out2
drwxr-xr-x   - cloudera cloudera            0 2021-02-03 22:08 /user/cloudera/outactor
drwxr-xr-x   - cloudera cloudera            0 2020-11-12 01:10 /user/cloudera/outp1
-rw-r--r--   3 cloudera cloudera            0 2021-01-28 01:42 /user/cloudera/output
drwxr-xr-x   - cloudera cloudera            0 2021-10-11 23:02 /user/cloudera/output1
drwxr-xr-x   - cloudera cloudera            0 2021-02-03 19:59 /user/cloudera/outputa
-rw-r--r--   3 cloudera cloudera          191 2020-03-11 21:11 /user/cloudera/patient
drwxr-xr-x   - cloudera cloudera            0 2021-10-11 00:15 /user/cloudera/pig_b3
-rw-r--r--   3 cloudera cloudera          190 2021-02-03 20:19 /user/cloudera/players
-rw-r--r--   3 cloudera cloudera          206 2020-03-09 20:51 /user/cloudera/pro
-rw-r--r--   3 cloudera cloudera           45 2020-03-09 20:51 /user/cloudera/q.csv
drwxr-xr-x   - cloudera cloudera            0 2021-10-13 03:56 /user/cloudera/saisreeja
-rw-r--r--   3 cloudera cloudera          206 2020-03-10 20:47 /user/cloudera/samp148.csv
drwxr-xr-x   - cloudera cloudera            0 2021-10-15 22:28 /user/cloudera/sat
drwxr-xr-x   - cloudera cloudera            0 2020-03-11 23:30 /user/cloudera/script1_086
drwxr-xr-x   - cloudera cloudera            0 2020-03-11 23:12 /user/cloudera/script_086
drwxr-xr-x   - cloudera cloudera            0 2020-03-12 21:50 /user/cloudera/script_o7
drwxr-xr-x   - cloudera cloudera            0 2020-03-12 21:57 /user/cloudera/script_o8
-rw-r--r--   3 cloudera cloudera          145 2020-03-09 22:51 /user/cloudera/split.txt
drwxr-xr-x   - cloudera cloudera            0 2021-02-03 01:25 /user/cloudera/student
drwxr-xr-x   - cloudera cloudera            0 2021-10-15 22:14 /user/cloudera/sty
drwxr-xr-x   - cloudera cloudera            0 2020-11-12 02:07 /user/cloudera/t1
drwxr-xr-x   - cloudera cloudera            0 2020-11-12 01:58 /user/cloudera/t2
drwxr-xr-x   - cloudera cloudera            0 2021-10-15 22:07 /user/cloudera/teach
drwxr-xr-x   - cloudera cloudera            0 2021-02-03 01:25 /user/cloudera/teacher
-rw-r--r--   3 cloudera cloudera           13 2021-02-03 01:46 /user/cloudera/test
-rw-r--r--   3 cloudera cloudera         8359 2020-11-18 01:22 /user/cloudera/text.csv
-rw-r--r--   3 cloudera cloudera           64 2021-02-03 20:04 /user/cloudera/train
-rw-r--r--   3 cloudera cloudera           64 2021-02-03 22:14 /user/cloudera/train1
-rw-r--r--   3 cloudera cloudera           64 2021-02-03 22:30 /user/cloudera/train11

[cloudera@localhost ~]$ hadoop fs -cat /user/cloudera/kk1
1,a
2,b
3,c
4,d

[cloudera@localhost ~]$ sqoop eval --connect "jdbc:mysql://localhost/dd1"
--username root --query "create table table_kk1(a int , b char)"

21/10/15 22:37:02 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
21/10/15 22:37:03 INFO tool.EvalSqlTool: 0 row(s) updated.

[cloudera@localhost ~]$ sqoop export --connect "jdbc:mysql://localhost/dd1"
--username root --table table_kk1 --export-dir /user/cloudera/kk1

21/10/15 22:38:05 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
21/10/15 22:38:05 INFO tool.CodeGenTool: Beginning code generation
21/10/15 22:38:05 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `table_kk1` AS t LIMIT 1
21/10/15 22:38:05 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `table_kk1` AS t LIMIT 1
21/10/15 22:38:05 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-0.20-mapreduce
21/10/15 22:38:05 INFO orm.CompilationManager: Found hadoop core jar at: /usr/lib/hadoop-0.20-mapreduce/hadoop-core.jar
Note: /tmp/sqoop-cloudera/compile/9db04b8e4401494dec22682ee84857be/table_kk1.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
21/10/15 22:38:08 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-cloudera/compile/9db04b8e4401494dec22682ee84857be/table_kk1.jar
21/10/15 22:38:08 INFO mapreduce.ExportJobBase: Beginning export of table_kk1
21/10/15 22:38:09 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
21/10/15 22:38:10 INFO input.FileInputFormat: Total input paths to process : 1
21/10/15 22:38:10 INFO input.FileInputFormat: Total input paths to process : 1
21/10/15 22:38:10 INFO mapred.JobClient: Running job: job_202110152121_0009
21/10/15 22:38:11 INFO mapred.JobClient:  map 0% reduce 0%
21/10/15 22:38:46 INFO mapred.JobClient:  map 25% reduce 0%
21/10/15 22:38:49 INFO mapred.JobClient:  map 50% reduce 0%
21/10/15 22:38:57 INFO mapred.JobClient:  map 75% reduce 0%
21/10/15 22:38:58 INFO mapred.JobClient:  map 100% reduce 0%
21/10/15 22:39:00 INFO mapred.JobClient: Job complete: job_202110152121_0009
21/10/15 22:39:00 INFO mapred.JobClient: Counters: 24
21/10/15 22:39:00 INFO mapred.JobClient:   File System Counters
21/10/15 22:39:00 INFO mapred.JobClient:     FILE: Number of bytes read=0
21/10/15 22:39:00 INFO mapred.JobClient:     FILE: Number of bytes written=697132
21/10/15 22:39:00 INFO mapred.JobClient:     FILE: Number of read operations=0
21/10/15 22:39:00 INFO mapred.JobClient:     FILE: Number of large read operations=0
21/10/15 22:39:00 INFO mapred.JobClient:     FILE: Number of write operations=0
21/10/15 22:39:00 INFO mapred.JobClient:     HDFS: Number of bytes read=580
21/10/15 22:39:00 INFO mapred.JobClient:     HDFS: Number of bytes written=0
21/10/15 22:39:00 INFO mapred.JobClient:     HDFS: Number of read operations=16
21/10/15 22:39:00 INFO mapred.JobClient:     HDFS: Number of large read operations=0
21/10/15 22:39:00 INFO mapred.JobClient:     HDFS: Number of write operations=0
21/10/15 22:39:00 INFO mapred.JobClient:   Job Counters
21/10/15 22:39:00 INFO mapred.JobClient:     Launched map tasks=4
21/10/15 22:39:00 INFO mapred.JobClient:     Data-local map tasks=4
21/10/15 22:39:00 INFO mapred.JobClient:     Total time spent by all maps in occupied slots (ms)=82920
21/10/15 22:39:00 INFO mapred.JobClient:     Total time spent by all reduces in occupied slots (ms)=0
21/10/15 22:39:00 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
21/10/15 22:39:00 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
21/10/15 22:39:00 INFO mapred.JobClient:   Map-Reduce Framework
21/10/15 22:39:00 INFO mapred.JobClient:     Map input records=4
21/10/15 22:39:00 INFO mapred.JobClient:     Map output records=4
21/10/15 22:39:00 INFO mapred.JobClient:     Input split bytes=528
21/10/15 22:39:00 INFO mapred.JobClient:     Spilled Records=0
21/10/15 22:39:00 INFO mapred.JobClient:     CPU time spent (ms)=2890
21/10/15 22:39:00 INFO mapred.JobClient:     Physical memory (bytes) snapshot=424091648
21/10/15 22:39:00 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=2617999360
21/10/15 22:39:00 INFO mapred.JobClient:     Total committed heap usage (bytes)=243007488
21/10/15 22:39:00 INFO mapreduce.ExportJobBase: Transferred 580 bytes in 51.1094 seconds (11.3482 bytes/sec)
21/10/15 22:39:00 INFO mapreduce.ExportJobBase: Exported 4 records.

[cloudera@localhost ~]$ sqoop eval --connect "jdbc:mysql://localhost/dd1"
--username root --query "select * from table_kk1"

21/10/15 22:41:47 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
-------------------
| a           | b |
-------------------
| 1           | a |
| 2           | b |
| 3           | c |
| 4           | d |
-------------------