OT: SAS, SQL, Python, & R

Submitted by tasnyder01 on May 7th, 2019 at 12:26 PM

Board is seemingly dead, and also seemingly full of comp/data gurus. So.

Which programs do you use, what are your thoughts, and (the age old debate) why is R better than Python?

Some personal background: just moved to a new company that uses SAS like crazy. I'm much more of a SQL or Proc SQL guy. Any tips for how to learn SAS? I'll also have to practice some R, so any tips on good practice sites for R welcome too.

mGrowOld

May 7th, 2019 at 12:33 PM ^

This is how the thread looks to me....

OT:SAS,SQL,Python

和R  由tasnyder01於2019年5月7日下午12:26提交 添加新評論 董事會似乎已經死了,而且看似充滿了comp / data大師。所以。 你使用哪些程序,你有什麼想法,以及(古老的爭論)為什麼R比Python更好? 一些個人背景:剛搬到一家使用SAS瘋狂的新公司。我更像是一個SQL或Proc SQL人。有關如何學習SAS的任何提示?我還必須練習一些R,所以R的優秀練習網站的任何提示也歡迎。

OT:SAS,SQL,Python hé R  yóu tasnyder01 yú 2019 nián 5 yuè 7 rì xiàwǔ 12:26 Tíjiāo tiānjiā xīn pínglùn dǒngshìhuì sìhū yǐjīng sǐle, érqiě kàn shì chōngmǎnle comp/ data dàshī. Suǒyǐ. Nǐ shǐyòng nǎxiē chéngxù, nǐ yǒu shé me xiǎngfǎ, yǐjí (gǔlǎo de zhēnglùn) wèishéme R bǐ Python gèng hǎo? Yīxiē ge rén bèijǐng: Gāng bān dào yījiā shǐyòng SAS fēngkuáng de xīn gōngsī. Wǒ gèng xiàng shì yīgè SQL huò Proc SQL rén. Yǒuguān rúhé xuéxí SAS de rènhé tíshì? Wǒ hái bìxū liànxí yīxiē R, suǒyǐ R de yōuxiù liànxí wǎngzhàn de rènhé tíshì yě huānyíng.

DualThreat

May 7th, 2019 at 1:15 PM ^

This reminds me of how I became an engineer...

I was admitted to the UM college of LS&A out of high school.  Toward the end of my freshman year I found out that LS&A requires its students to take 4(!) years of a foreign language.  I did not want to do that so I talked to my advisor and asked if computer language could count as foreign language.  They checked on it but, alas, came back and said no.  Finding out that the college of engineering didn't require proficiency in a foreign language, I applied and was accepted there starting my sophomore year.  I've been an engineer ever since.

If I had mGrowOld's post here way back in 2001, I might have submitted it as evidence that computer language should count as foreign language!

SoullessHack

May 7th, 2019 at 2:15 PM ^

Same. 

I played a game with myself where I tried to figure out what all the acronyms meant. For SAS I got “Special Air Service” (which I think is some kind of UK special forces). And then I read “Python” and thought “Monty Python,” which is ALSO UK-related so I decided that I was definitely right on Special Air Service and then I lost interest and gave up.

Turns out I was not right.

mGrowOld

May 7th, 2019 at 12:55 PM ^

I plan on stealing your analogy because it's great.

I listened to my 14-year-old argue to the death with his friend that snowboarding was better than skiing, and my response was that this was like arguing chocolate ice cream is better than vanilla.

But yours is better IMO.

Bo Glue

May 7th, 2019 at 1:51 PM ^

Glad you like it. I think your own analogy might be more apt, though. :)

I was going more for "use the right tool for the job" than "whatever floats your boat".

That said, they are different ways to enjoy the mountain, so I guess in a way they are tools.

Grampy

May 7th, 2019 at 2:02 PM ^

No tool is going to make up for a lack of understanding of the problem space. This is particularly true of model abstraction tools (I’m looking at you, MATLAB) as applied to simulation software which must work in real-world applications.  It’s good for academics, though.

JPC

May 7th, 2019 at 12:44 PM ^

I've used them all, though most of my industrial experience is with SAS on huge (multi-TB) data sets. Using "Proc SQL" to implement SQL commands in SAS was super helpful before I got good at SAS. The Little SAS Book should be your number 1 starting point with SAS. It won't teach you everything that you need to know, but it will get you to the point where you can figure anything out via Google.

As an academic, I use STATA for stats and MATLAB for simulation. The only time I'd ever use R, or more likely S Plus, would be if I was doing a time series analysis. 

DavidP814

May 7th, 2019 at 1:12 PM ^

I learned SAS over a decade ago, so my pain/frustration has been dulled over the years, but watching others try to learn it now, there seems to be a very high learning curve.  SQL is really intuitive, but gets wonky when it's used for more than queries or simple data manipulation.

I used MATLAB and Python sparingly in undergrad, but I've never seen either language used in a professional setting.

wolfman81

May 7th, 2019 at 12:58 PM ^

 

I mean, SQL isn't really a Turing complete programming language.

That being said, I'll pile on with some of the others here, each has their use case.  When I was a broke grad student, I realized I'd be doing some stats-heavy research, so I learned R.  It has been very useful for me.  My "new" fave:  Jupyter Notebooks.

NittanyFan

May 7th, 2019 at 1:09 PM ^

The best skill of all: the ability to analyze data while ALSO being able to talk about it meaningfully and correctly to a non-technically inclined audience.

I'm in the data analytics field: I wish I saw more of the skill above.  There are plenty of coders.  There are not enough consultants.

For the record, IMO: R > SAS > Python

Zarniwoop

May 7th, 2019 at 4:05 PM ^

The only time a consultant adds value is if your entire staff is incompetent and/or you have to blow your budget quickly so you don’t lose it.

Only exception really is in extremely specialized projects or when you're literally paying someone to specifically design a new process (which people will follow resentfully for a time and then abandon).

uminks

May 7th, 2019 at 1:17 PM ^

I use java with google scripts a lot for most of our databases at work. My favorite and most flexible is Perl (especially for regular expressions). My work uses a lot of Python in our server-based applications.

gruden

May 7th, 2019 at 2:04 PM ^

Regex... HISSSSS!  Ugh I hate that.

Did do plenty of Perl myself for a few years to talk to LDAP and parse/build LDIF files.  Now it's mostly Powershell and a little Python thrown in for fun to talk to my systems and make them do my bidding. :)

Blue@Petoskey

May 7th, 2019 at 1:26 PM ^

It depends on the industry.

Health insurance companies love SAS.  P&C love R.  Both love SQL to an extent and everyone else is falling in love with Python it seems.

I am overgeneralizing but as an actuary that has applied to and researched many insurance carriers and other companies, these preferences are what I had to ask for.

That being said, I learned SAS through The Infinite Actuary and they do have learning modules for R.  I wouldn't go the same route if you are not an actuary but would ask your colleagues in this new position what they used to learn SAS.

uminks

May 7th, 2019 at 1:29 PM ^

I remember when all we had in the 80s for my engineering classes was FORTRAN 77.  Some of the professors were using C.

Qseverus

May 7th, 2019 at 2:14 PM ^

Brings back memories. Pascal was my first programming class back in the 80s. Still have my textbook. Used to mess around with Turbo Pascal from Borland. At work I ended up doing a lot of database programming with dBase and Foxpro.

ST3

May 7th, 2019 at 3:25 PM ^

I had a summer job programming in dBase III. dBase IV was just coming out. It was a disaster.

Starting in the mid-1980s, several companies produced their own variations on the dBase product and especially the dBase programming language. These included FoxBASE+ (later renamed FoxPro), Clipper, and other so-called xBase products. Many of these were technically stronger than dBase, but could not push it aside in the market.[4][5] This changed with the disastrous introduction of dBase IV, whose design and stability were so poor that many users switched to other products.

dBase IV, the New Coke of database software programs.

https://en.wikipedia.org/wiki/DBase

BlueMan80

May 7th, 2019 at 1:56 PM ^

While at Michigan for my engineering undergrad, I learned  IBM 360 assembler, Fortran, PL/C, APL, Algol, and Snobol.  I learned C when I joined AT&T Bell Labs.  I dabbled in Java to help my son with a high school programming course.  I learned some HTML to overcome the deficiencies in a template based web site building tool.

I've always wondered if I should try my hand at modern coding, but have wondered what would make sense to try.  This thread has at least pointed me at things to start Googling.

BJNavarre

May 7th, 2019 at 1:48 PM ^

R is kinda a garbage programming language. Its big advantage over Python was its data analysis and visualization libraries. I don't think it has much of an advantage in that regard anymore, but I'm not doing PhD level statistical work either. No sane person is using R for anything else, whereas Python is easy to learn and one of the most popular programming languages out there. If I were doing serious data analysis work, I would start with Python.

SQL is fine for querying relational databases.

I know nothing about SAS.

For the record, I'm coding in C# and typescript at my job for most of the day. It's been a few years since I've seriously used R or Python, so take my hot programming take for what it's worth.

ColoradoBlue

May 7th, 2019 at 4:13 PM ^

This.  R is great at data analysis and visualization, but Python is closing that gap pretty quickly AND Python can be used for general development.  R is not a general purpose language.  

And I think all analysts should have solid SQL chops unless all of your data sources are flat files.
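For anyone wondering what "SQL chops" buys you even inside Python, here's a rough sketch using the standard-library sqlite3 module. The table name, columns, and numbers are all made up for illustration; the only point is that the grouping happens in the query rather than in a hand-written loop.

```python
import sqlite3

# Hypothetical rows, invented purely for illustration: (tool, industry, analysts).
rows = [
    ("SAS", "health insurance", 12),
    ("R", "P&C insurance", 7),
    ("Python", "tech", 20),
    ("Python", "P&C insurance", 5),
]

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    "CREATE TABLE tool_usage (tool TEXT, industry TEXT, analysts INTEGER)"
)
conn.executemany("INSERT INTO tool_usage VALUES (?, ?, ?)", rows)

# Let the database do the aggregation instead of looping over rows in Python.
query = """
    SELECT tool, SUM(analysts) AS total
    FROM tool_usage
    GROUP BY tool
    ORDER BY total DESC, tool
"""
totals = conn.execute(query).fetchall()
conn.close()

for tool, total in totals:
    print(f"{tool}: {total}")
```

The same SELECT would run more or less unchanged inside Proc SQL or against a real warehouse, which is why it's a handy common denominator.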

bossmania

May 7th, 2019 at 5:22 PM ^

I started with SPSS because that's what we used at my first job, but I switched to Python about a year ago. I love how easy it is to use scikit-learn, Keras and TensorFlow. Take some time to really learn matplotlib and numpy and you've got everything you need to do machine learning. Jupyter notebook is nice too; it lets you "tell a story" for the suits in the same document where you write your code. Next step for me is moving everything to Azure (not my choice...) and leveraging Spark to speed things up.

If I were starting now I'd definitely start with Python, it hides a lot of complexity away from the user and makes it super easy to get started. On top of that you can use Python to do pretty much anything, as opposed to R or SPSS which are kind of limited to statistics stuff.