
Help: Hive query


vendettaa


I have two tables in Hive; each has billions of records and 200 columns.

I want to compare each column against the corresponding column in the other table, matching rows on the primary key, and trigger an email containing the mismatched records.

Spark code is also fine.

 


3 minutes ago, vendettaa said:

I have two tables in Hive; each has billions of records and 200 columns.

I want to compare each column against the corresponding column in the other table, matching rows on the primary key, and trigger an email containing the mismatched records.

Spark code is also fine.

 

Code is not fine?


1 minute ago, vendettaa said:

Either one, man.

Once I know this piece, I need to create YAMLs and automate it.

Spark or Hive is fine.

I'm not an expert, but I think you can use the DataFrame except API for this.

Put table 1's data into DataFrame 1

and table 2's data into DataFrame 2.

dataframe1.select(keyColumn).except(dataframe2.select(keyColumn))

You will get the keys from dataframe1 that are not present in dataframe2. It may not be a perfect answer, but you can adapt it to your use case.
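To make the behaviour of except concrete, here is a minimal sketch in plain Python, with sets of tuples standing in for the two DataFrames. The sample rows and the (key, col_a, col_b) layout are invented for illustration. Since except is a distinct set difference, comparing only the key column reports missing keys, while comparing whole rows reports rows that differ somewhere without saying which column is responsible.

```python
# Plain-Python stand-in for DataFrame except: a distinct set difference.
# Rows are (key, col_a, col_b) tuples; the data is invented for illustration.
table1 = {(1, "x", 10), (2, "y", 20), (3, "z", 30)}
table2 = {(1, "x", 10), (2, "y", 99)}

# Key column only: keys present in table1 but absent from table2.
keys1 = {row[0] for row in table1}
keys2 = {row[0] for row in table2}
missing_keys = keys1 - keys2

# Whole rows: rows of table1 that have no identical row in table2.
# This catches key 2 (a value differs) and key 3 (missing entirely),
# but it does not say which column caused the mismatch.
differing_rows = table1 - table2

print(sorted(missing_keys))
print(sorted(differing_rows))
```

So for a column-by-column report, the key-only except is probably not enough on its own; a join on the primary key followed by per-column comparisons is likely needed.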


6 minutes ago, NPReddy said:

I'm not an expert, but I think you can use the DataFrame except API for this.

Put table 1's data into DataFrame 1

and table 2's data into DataFrame 2.

dataframe1.select(keyColumn).except(dataframe2.select(keyColumn))

You will get the keys from dataframe1 that are not present in dataframe2. It may not be a perfect answer, but you can adapt it to your use case.

OK, how do I do this in Hive?

I'm not sure whether we have commands to invoke Spark from the YAML, but thanks.


1 minute ago, vendettaa said:

OK, how do I do this in Hive?

I'm not sure whether we have commands to invoke Spark from the YAML, but thanks.

This isn't done in Hive. Read and process the Hive table data using Spark: write a Spark application that performs the comparison, or do it directly in the Spark shell. I'm not sure about the email part.
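For the email part, one possible approach is to build the message with Python's standard library once the mismatches have been collected. The sketch below only constructs the message. The record layout (key, column, table1, table2), the function name build_mismatch_email, and the SMTP host in the comment are all assumptions made for illustration.

```python
import csv
import io
from email.message import EmailMessage


def build_mismatch_email(mismatches, sender, recipient):
    """Build (but do not send) an email with the mismatches attached as CSV.

    `mismatches` is a list of dicts such as
    {"key": 2, "column": "col_b", "table1": 20, "table2": 99};
    this record layout is an assumption of the sketch.
    """
    msg = EmailMessage()
    msg["Subject"] = f"Table comparison: {len(mismatches)} mismatched values"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(
        f"{len(mismatches)} mismatched values were found. "
        "Details are in the attached CSV."
    )

    # Serialise the records to CSV in memory and attach the text.
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["key", "column", "table1", "table2"])
    writer.writeheader()
    writer.writerows(mismatches)
    msg.add_attachment(buf.getvalue(), subtype="csv", filename="mismatches.csv")
    return msg


# Sending is left commented out because the SMTP host is site-specific:
# import smtplib
# with smtplib.SMTP("smtp.example.com") as server:  # hypothetical host
#     server.send_message(msg)
```

With billions of rows the full mismatch set could be far too large to mail, so it may be safer to attach only a capped sample or per-column counts and write the complete result to a table.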


1 minute ago, NPReddy said:

This isn't done in Hive. Read and process the Hive table data using Spark: write a Spark application that performs the comparison, or do it directly in the Spark shell. I'm not sure about the email part.

OK, I'm asking how to do it without Spark.


Just now, vendettaa said:

OK, I'm asking how to do it without Spark.

Not sure, buddy. Since you asked for Spark, I just tried to help. How do you run your Hive queries? From the shell? I think you could create some functions to perform this, but since you are dealing with billions of records, I don't think that is recommended. Wait for the experts.
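For what it's worth, one Hive-only route might be to generate the comparison query instead of writing 200 column comparisons by hand. Below is a rough sketch in Python that builds a HiveQL string using a full outer join on the key and the null-safe operator <=>. The function name build_compare_query, the _t1/_t2 output aliases, and the example table and column names are all made up for illustration. It assumes at least one column and that the names come from a trusted config, since nothing is quoted or escaped.

```python
def build_compare_query(table1, table2, key, columns):
    """Return a HiveQL query listing rows whose compared columns differ.

    Names are inserted as-is, so they must come from a trusted source.
    The `<col>_t1` / `<col>_t2` aliases are a convention of this sketch.
    """
    # Show each compared column from both tables side by side.
    pairs = ",\n       ".join(
        f"a.{c} AS {c}_t1, b.{c} AS {c}_t2" for c in columns
    )
    # NOT (x <=> y) is a null-safe inequality: two NULLs count as equal,
    # while NULL versus a value counts as a mismatch.
    diffs = "\n   OR ".join(f"NOT (a.{c} <=> b.{c})" for c in columns)
    return (
        f"SELECT COALESCE(a.{key}, b.{key}) AS {key},\n"
        f"       {pairs}\n"
        f"FROM {table1} a\n"
        f"FULL OUTER JOIN {table2} b\n"
        f"  ON a.{key} = b.{key}\n"
        # Rows missing from either side are reported as well.
        f"WHERE a.{key} IS NULL\n"
        f"   OR b.{key} IS NULL\n"
        f"   OR {diffs}"
    )


# Hypothetical table and column names, for illustration only.
query = build_compare_query("db.table1", "db.table2", "id", ["name", "amount"])
print(query)
```

The generated text could be saved to a file and run with hive -f or beeline -f, or passed to spark.sql if Spark is available. The column list could come from the YAML config mentioned earlier. A full outer join over billions of rows is likely to be a heavy job, so it may be worth trying on a single partition first.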
