Posts

PySpark Installation

 Do pyspark fun it will be, they said. I had a pretty frustrating week just trying to install and get it working. primarily because I am not very technical and secondary, everything is a huge clusterfuck. Where do we start, First to use PySpark - You need Spark - and to use Spark ideally it should be installed on hadoop platform.  To install hadoop you can download CDH from cloudera website as my friends did. Steps I followed : Install virtual machine on your windows system  Everyone preferred Oracle's virtual box because free and trusted Done..Easy Peasy Download CDH ISO file from cloudera This is very things started getting bonkers Cloudera did away with download of CDH in favor of CDP. What is CDP you ask ? It is like combination of CDH and HDP. You don't care about it ? Fair enough neither did I Problem is for CDH they give ISO file for CDP they give commands which you can run on linux machine to download it, that for 60 days only Now I realize I have to download linux OS o

Exadata Scripts and Info

select name,value/1024/1024 as mb from v$statname natural join v$mystat where name in ('cell physical IO interconnect bytes returned by smart scan', 'cell physical IO bytes saved by storage index'); select name,value byt from v$statname natural join v$mystat where name in ('cell physical IO interconnect bytes returned by smart scan', 'cell physical IO bytes saved by storage index'); select name,value byt from v$statname natural join v$sysstat where name in ('cell physical IO interconnect bytes returned by smart scan', 'cell physical IO bytes saved by storage index'); select * from (select first_name, salary, avg(salary) over (partition by department_id order by employee_id) as avg_sal from emp_lg_rnd) where salary > avg_sal CREATE USER APP1 IDENTIFIED BY APP1 DEFAULT TABLESPACE USERS TEMPORARY TABLESPACE TEMP; GRANT DBA TO APP1; sqlplus app1/APP1 create table sales as select 'Oracle Enterprise Edition' as produc

Non Ascii data Oracle

-- Oracle Query to identify non ascii data SELECT 'PRICING' , 'PROD_CHAN' , 'PROD_CHAN_TYP_NM' , PROD_CHAN_TYP_NM FROM   PRICING . PROD_CHAN   WHERE LENGTH ( ASCIISTR ( PROD_CHAN_TYP_NM )) <> LENGTH ( PROD_CHAN_TYP_NM ) ; -- Oracle Query to identify non ascii data  SELECT 'PRICING' , 'PROD_CHAN' , 'PROD_CHAN_TYP_NM' , PROD_CHAN_TYP_NM FROM   PRICING . PROD_CHAN   WHERE REPLACE ( TRANSLATE ( LOWER ( PROD_CHAN_TYP_NM ) , 'abcdefghijklmnopqrstuvwxyz1234567890_-+=() *&^%$#@!~`:}{][|\/?.,><"'''||chr ( 10 )|| chr ( 13 )|| chr ( 9 ), 'X' ), 'X' , '' ) ;

BTEQ Wrapper

#! /bin/ksh filename=$1 echo $filename . /pkg1/Informatica/PowerCenter860/server/infa_shared/UnixScripts/DataInt/TD/hoovers_etl.env bteq <<stop>/pkg1/Informatica/PowerCenter860/server/infa_shared/UnixScripts/DataInt/TD/logs/loader/$(date +"20%y%m%d")/$(echo $filename|cut -c85-100).log 2>&1 logon $TDPID/$TDUSER,$TDPASS; .RUN FILE=$filename; EXIT; STOP

My TD Codes

REPLACE MACRO BTEQ_EXPORT ( DATABASENAME VARCHAR(100), TABLENAME VARCHAR(100), FILENAME VARCHAR(500), SERVER VARCHAR(100) default '' , USERNAME VARCHAR(300) default '', PASS VARCHAR(15) default 'password' ) AS ( SEL '                   .SESSIONS 5 .SET ECHOREQ ON .SET FORMAT OFF .SET QUIET ON .SET WIDTH 1000 .SET RETCANCEL ON .LOGON  '   || :SERVER || '/' || :USERNAME || ', ' || :PASS || '; '  ||     ' .EXPORT REPORT FILE =' || :FILENAME BTEQ_SCRIPT     FROM SYS_CALENDAR.CALENDAR WHERE DATE=CALENDAR_DATE     UNION   SEL '                   SEL TOP 1000000            '   FROM SYS_CALENDAR.CALENDAR WHERE DATE=CALENDAR_DATE   UNION   SEL   '                 TRIM(COALESCE (CAST(' || COLUMNNAME ||  'AS VARCHAR(300)),' ||'27'xc || '27'xc || ')) ||  ' ||'27'xc ||'|' || '27'xc || '||' FROM DBC.COLUMNS WHERE TABLENA

Small Unix Tips

To find contact_opt_out recrsively inside the directory find . | xargs grep -s OMNI_Balbao_Loads.par ~~ date + "%y-%m-%d" ~~ to manipulate string in command use brackets preceeded by dollar sign eg: mv filename $(echo filename|cut c-2-100) renames file from fielname to ilename ( because of cut command) ~~ To ridrect error commands use 2> or >& eg sh script.sh 2>errorfile ,sh script.sh >& errorfile ~~ To check space du -sh df /foldername df -h ~~

Few More Queries

--Check query Performance SEL     TOP 5   QueryId  , cast ( QUERYTEXT as varchar ( 5000 ))  , ElapsedTime                    , MaxAMPCPUTime                  , AMPCPUTime                     , MaxAmpCpuTime * NumOfActiveAmps                  ,( MaxAmpCpuTime - ( 1.00 * AmpCpuTime )/ NumOfActiveAmps ) ,( 1 -( 1.00 * AmpCpuTime /( MaxAmpCpuTime * NumOfActiveAmps )))* 100 ,( MaxAmpCpuTime * NumOfActiveAmps )/( AmpCpuTime ) FROM            DBC . QRYLOG WHERE         MAXAMPCPUTIME > 100 ; -- Query to find stats collected about a month ago . Note : Original query picked up from Teradata Forum (As far as memory serves Original Author :Dieter Noeth ) .I have modified it a bit for my convenience . Stats modified query LOCKING ROW FOR ACCESS SEL DISTINCT   'COLLECT STATS ON ' || DATABASENAME || '.' || TABLENAME || ' ;'    -- || 'COLUMN (' || COLUMNNAME ||');' FROM ( SELECT C . DATABASE