I LOVE STATA
Scooped by Mhd.Shadi Khudr
onto I LOVE STATA

NEW METHODOLOGIES IN STATISTICS: A DIFFERENT WAY OF STUDYING SPSS


 

A new way of learning statistics with computers is presented.

 

For this purpose an interactive guide, comprising a web site and an emulator of the SPSS statistics package, has been developed.

 

These have been prepared for the first-year Biostatistics course of the BA in Biology at the University of Granada, though they may be used by anyone who needs to get started in data management with SPSS.


The web site includes different cases (based on exercises with SPSS), each with a theoretical and practical introduction and an exercise solved with an SPSS program emulator.

 

Finally another exercise, to be solved by the student with the SPSS program itself, is proposed.


The program emulator aims to ease the student's first contact with the SPSS program, guiding them through the interactive execution of a specific exercise.

 

The emulator enables only the options needed to perform the guided practice properly and monitors the user's actions. It warns the student with pop-up windows when an action is wrong and, in some cases, offers hints to help the student find and correct the mistake.


In this project we present the Interactive SPSS Self-learning Guide version 1.3 which comprises five different practices:


*the first one is an introduction to the program and Data Editor;

 

*the second one is about Descriptive Statistics;

 

*the third one is about Regression Analysis;

 

*the fourth one, about Probability Calculation;

 

*and the fifth one focuses on Confidence Intervals.

 

Post Image: http://bit.ly/NwNTYd

I LOVE STATA
Let's Master and Muster Statistics and 'Graphin-it' via Serendipity

♣ Biodiversity Calculators 1: For the Simpson and Shannon Indexes



This calculator is free to use and is designed for biologists, ecologists, teachers, and students needing to quickly calculate the biodiversity indexes of an ecosystem.


First, enter the number of species, and then enter the name you wish to give the species, if available, and the given populations for each of the species—in any given order.


The script will return the Simpson and Shannon-Wiener values (among almost two dozen others) for the given data...



 Supportive Calculators: 

http://bit.ly/1BEHyMj

http://bit.ly/1FMvBKE

http://bit.ly/1xm8EwR



 On Simpson Index: 

A measure that accounts for both richness and proportion (percent) of each species is the Simpson's diversity index. It has been a useful tool to terrestrial and aquatic ecologists for many years and will help us understand the profile of biofilm organisms and their colonization pattern in the Inner Harbor.


The index, first developed by Simpson in 1949, has been defined three different ways in published ecological research. The first step for all three is to calculate Pi, the proportion of each species: the number of individuals of a given species divided by the total number of organisms observed.

http://bit.ly/1Fd2NfB
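As a minimal sketch of the calculation described above, the Python snippet below computes Pi for each species and then the three forms in which Simpson's index is commonly reported: D (the sum of the squared proportions), 1 - D, and 1 / D. The function name and the species counts are hypothetical, and the linked page may use a finite-sample variant that is not shown here.

```python
def simpson_indices(counts):
    """Return (D, 1 - D, 1 / D) for a list of per-species counts."""
    total = sum(counts)
    proportions = [n / total for n in counts]   # Pi for each species
    d = sum(p * p for p in proportions)         # Simpson's index D
    return d, 1 - d, 1 / d

# Hypothetical sample: four species with 10 individuals each
d, diversity, reciprocal = simpson_indices([10, 10, 10, 10])
print(d, diversity, reciprocal)  # 0.25 0.75 4.0
```

With perfectly even counts, 1 / D equals the number of species, which is a handy sanity check.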



 On Shannon Index:

This diversity measure came from information theory and measures the order (or disorder) observed within a particular system. In ecological studies, this order is characterized by the number of individuals observed for each species in the sample plot (e.g., biofilm on an acrylic disc).


It has also been called the Shannon index and the Shannon-Weaver index. Similar to the Simpson index, the first step is to calculate Pi for each category (e.g., species). You then multiply each Pi by its own logarithm. While you may use any base, the natural log (ln) is commonly used. The index is the negative of the sum of these products.

http://bit.ly/1Fd2NfB
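The steps just described can be sketched in a few lines of Python. This is an illustrative sketch only: the function name and counts are hypothetical, the natural log is used as the text suggests, and species with a zero count are skipped because ln(0) is undefined.

```python
import math

def shannon_index(counts):
    """Return H' = -sum(Pi * ln(Pi)) over species with non-zero counts."""
    total = sum(counts)
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)

# Hypothetical sample: four equally abundant species, so H' = ln(4)
h = shannon_index([10, 10, 10, 10])
print(round(h, 4))  # 1.3863
```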



♣♣ Important Definitions ♣♣

► Biodiversity:

Biological diversity, or biodiversity, is a term that is heard more and more often, yet few people really know what it means. There are many definitions for it; two are given here.


The first is from the Convention on Biological Diversity, also known as the Rio Summit: "'Biological diversity' means the variability among living organisms from all sources including, inter alia, terrestrial, marine and other aquatic ecosystems and the ecological complexes of which they are part; this includes diversity within species, between species and of ecosystems."


The Canadian Biodiversity Strategy defines it as "…the variety of species and ecosystems on Earth and the ecological processes of which they are a part". It is often simply used as a catch-all term for nature. No definition is perfect; as with life itself, it's a bit nebulous and there are always exceptions.

http://bit.ly/1DEYRDS




► Biodiversity Indices

A Biodiversity Index gives scientists a concrete, uniform way to talk about and compare the biodiversity of different areas. Learn how to calculate this number yourself.

http://bit.ly/1x8p1wF

http://bit.ly/1x8vRSX

http://bit.ly/19F0HrY



Species Richness

Species Richness is the number of species present in a sample, community, or taxonomic group.

Species richness is one component of the concept of species diversity, which also incorporates evenness. 

http://bit.ly/1BWwo9w



 Species Evenness

Evenness is the relative abundance of species. It refers to how evenly individuals are distributed among the species in a community. In other words, species evenness describes how close in numbers the species in an environment are.

http://bit.ly/1BWwo9w

http://bit.ly/1x8vRSX
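To make richness and evenness concrete, here is a small Python sketch. It counts the species present and computes Pielou's J (the Shannon index divided by the log of the richness), which is one common evenness measure and is an assumption here, since the pages linked above do not necessarily use it. Function name and counts are hypothetical.

```python
import math

def richness_and_evenness(counts):
    """Return (species richness, Pielou's evenness J) for per-species counts."""
    present = [n for n in counts if n > 0]
    richness = len(present)
    total = sum(present)
    shannon = -sum((n / total) * math.log(n / total) for n in present)
    # J is undefined for a single species, so report NaN in that case
    evenness = shannon / math.log(richness) if richness > 1 else float("nan")
    return richness, evenness

print(richness_and_evenness([25, 25, 25, 25]))  # same richness, evenness close to 1
print(richness_and_evenness([97, 1, 1, 1]))     # same richness, much lower evenness
```

Both hypothetical communities have the same richness (four species), which is why richness alone does not capture diversity.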



♣ Supportive Info:

http://bit.ly/1BWqhBT

http://bit.ly/1Fd3KVs

http://bit.ly/1FJK5eE

http://bit.ly/1bjJp45

http://bit.ly/1bjK3id

http://bit.ly/1xFrmtX

http://bit.ly/1BaZwq2



Post Image: http://bit.ly/1FK0wI0



RASP



RASP (Reconstruct Ancestral State in Phylogenies) is a tool for inferring ancestral state using S-DIVA (Statistical dispersal-vicariance analysis), Lagrange (DEC), Bayes-Lagrange (S-DEC), BayArea, BBM (Bayesian Binary MCMC), BayesTraits and ChromEvol.


 Papers cited RASP


► Papers cited S-DIVA






What is the difference between the general linear model (GLM) and the generalized linear model (GZLM)?



"... the general linear model assumes that the ~errors~ are normally distributed, or equivalently that the response variable is normally distributed ~conditional~ on the linear combination of explanatory variables. 

If you look at textbooks or articles on the generalized linear model, the authors will almost certainly talk about the distinction in terms of the link function and error distribution. E.g., OLS linear regression is a generalized linear model with an identity link function and normally distributed errors. Binary logistic regression, on the other hand, is a generalized linear model with a logit link function and a binomial error distribution (because the outcome variable has only two possible values)." 
By Bruce Weaver · Lakehead University Thunder Bay Campus
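The role of the link function in the two examples quoted above can be sketched in plain Python. The same linear predictor is mapped to a mean through an identity link (as in OLS regression) or through the inverse of the logit link (as in binary logistic regression). The coefficients and function names are hypothetical, and no model is fitted here.

```python
import math

def linear_predictor(x, intercept, slope):
    """eta = intercept + slope * x, the linear combination of predictors."""
    return intercept + slope * x

def identity_inverse_link(eta):
    """Identity link: the mean of the response is eta itself."""
    return eta

def logit_inverse_link(eta):
    """Logit link: the mean is a probability, 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))

eta = linear_predictor(x=2.0, intercept=-1.0, slope=0.5)  # eta = 0.0
print(identity_inverse_link(eta))  # 0.0
print(logit_inverse_link(eta))     # 0.5
```

The identity link leaves the mean unbounded, while the logit link keeps it between 0 and 1, which is why it suits a two-valued outcome.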


♒ Highly Supportive: 


 More on the GLM:
In statistics, the general linear model (GLM) is a linear model in which a continuous response is written as a linear combination of explanatory variables plus normally distributed errors. Ordinary linear regression, ANOVA, and ANCOVA are special cases.


 More on the GZLM/GLZ:
In statistics, the generalized linear model (GZLM or GLZ here; abbreviated GLM in much of the literature) is a flexible generalization of ordinary linear regression that allows for response variables whose error distributions are not normal.


 A Note on Generalised Linear Mixed Models 

☞ Bonus:  
1* Five Extensions of the General Linear Model


2* Thus Spake Wolfram

☞ N.B. 
Make sure that you distinguish the above-mentioned GLM and GZLM from the following: 

1- General linear methods (GLMs)
GLMs are a large class of numerical methods used to obtain numerical solutions to differential equations. This class of methods in numerical analysis encompasses multistage Runge–Kutta methods that use intermediate collocation points, as well as linear multistep methods that save a finite time history of the solution. 


2-  Linear Regression
Linear regression is an approach for modeling the relationship between a scalar dependent variable y and one or more explanatory variables (or independent variable) denoted X. The case of one explanatory variable is called simple linear regression.

For more than one explanatory variable, the process is called multiple linear regression. (This term should be distinguished from multivariate linear regression, where multiple correlated dependent variables are predicted, rather than a single scalar variable.)
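For the simple (one explanatory variable) case, the least-squares fit has a closed form, sketched below in plain Python. The data and the function name are hypothetical, and the example only covers simple linear regression, not the multiple or multivariate cases.

```python
def simple_linear_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = covariance of x and y divided by the variance of x
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]   # generated from y = 1 + 2x with no noise
print(simple_linear_regression(x, y))  # (1.0, 2.0)
```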


Post Image:  http://bit.ly/1LmiEca


✺ How long will it take me to learn R and use it to do GHLM for the first time?



"Opinions surely differ a little on this, but in my opinion (and I suspect the opinions of others on this board), R is your best bet"... 



➽ Super Supportive: 

http://bit.ly/1ibfvig

http://bit.ly/1nfN6Iw

http://yhoo.it/Q0z5n1

http://bit.ly/1iElUPE



How best to learn R? 

http://bit.ly/1gi9Mqe



➲➲ Time to Tweak it ➲➲

The fastest way to learn a new programming language

http://bit.ly/1lXnLp8



Post Image: http://bit.ly/1ibeR4n



STATISTICAL THINKING FOR NOVICE RESEARCHERS IN THE BIOLOGICAL SCIENCES



Postgraduate students from non-statistical disciplines often have trouble designing their first experiment, survey or observational study, particularly if their supervisor does not have a statistical background.


Such students often present their results to a statistical consultant hoping that a suitable analysis will rescue a poorly designed study.


Unfortunately, it is often too late by that stage.


A statistical consultant is best able to help a student who has some grasp of statistics.


It is appropriate to use the Web to deliver training when required and that is the mechanism used in this project to encourage postgraduate students to develop statistical thinking in their research.


Statistical Thinking is taught in terms of the PPDSA cycle and students are encouraged to use other Web resources and books to expand their knowledge of statistical concepts and techniques...



Post Image: http://bit.ly/1vbodof



Explore Statistics with R :: Audit this course for free and have complete access



Learn statistics in a practical, experimental way, through statistical programming with R, using examples from the health sciences. We will take you on a journey from basic concepts of statistics to examples from the health science research frontier. 


Audit this course for free and have complete access to all of the course material, tests, and the online discussion forum. You decide what and how much you want to do...


Do you want to learn how to harvest health science data from the internet? Do you want to understand the world through data analysis? Start by exploring statistics with R!


In this course you will learn the basics of R, a powerful open source statistical programming language. Why has R become the tool of choice in bioinformatics, the health sciences and many other fields?


One reason is surely that it’s powerful and that you can download it for free right now. But more importantly, it’s supported by an active user community. 


In this course you will learn how to use peer reviewed packages for solving problems at the frontline of health science research.


Commercial actors just can’t keep up implementing the latest algorithms and methods.


When algorithms are first published, they are already implemented in R. Join us in a gold digging expedition. Explore statistics with R.



GRASS GIS >> Geospatial data management and analysis + visualization +++


 

GRASS GIS, commonly referred to as GRASS (Geographic Resources Analysis Support System), is a free and open source Geographic Information System (GIS) software suite used for geospatial data management and analysis, image processing, graphics and maps production, spatial modeling, and visualization.

 

GRASS GIS is currently used in academic and commercial settings around the world, as well as by many governmental agencies and environmental consulting companies.

 

It is a founding member of the Open Source Geospatial Foundation (OSGeo).

 

 


Getting staRted with R: An accelerated primer


R is a brilliant piece of software but learning it by yourself, particularly if you have not used command line software before, can be daunting.


This presentation is aimed at introducing beginner to intermediate users of R to some of the basic features of the program (through to programming a basic function).


Experienced R users are also encouraged to attend to help share their knowledge and help the first-timers.

Lyndon Walker has been using R for nearly half his life. He studied and worked at the UniveRsity of Auckland, the birthplace of R, and is currently a Senior Lecturer in Applied Statistics at Swinburne University of Technology in Melbourne...



Bonus:

Tinn-R text editor (http://sourceforge.net/projects/tinn-r/) is helpful in organising and saving your R code.




Prior distribution for a predicted probability :::: Statistical Modeling, Causal Inference, and Social Science



[ I received the following email:

"I have an interesting thought on a prior for a logistic regression, and would love your input on how to make it “work.”


Some of my research, two published papers, are on mathematical models of **. Along those lines, I’m interested in developing more models for **. . . . Empirical studies show that the public is rather smart and that the wisdom-of-the-crowd is fairly accurate.


So, my thought would be to treat the public’s probability of the event as a prior, and then see how adding data, through a model, would change or perturb our inferred probability of **. (Similarly, I could envision using previously published epidemiological research as a prior probability of a disease, and then seeing how the addition of new testing protocols would update that belief.)


However, everything I learned about hierarchical Bayesian models has a prior as a distribution on the coefficients.


I don’t know how to start with a prior point estimate for the probability in a logistic regression.


Do you have any ideas or suggestions on how to proceed?


I wrote back:....].
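The updating idea in the email (start from a prior probability, then see how new evidence shifts it) can be illustrated with Bayes' rule in odds form: posterior odds = prior odds × likelihood ratio. The sketch below is only an illustration of that rule with hypothetical numbers; it is not the reply given in the linked post. On the logit scale the same update is an addition, which is why one possible route in a logistic regression is to enter the logit of the prior probability as an offset.

```python
def update_probability(prior_prob, likelihood_ratio):
    """Update a prior probability with a likelihood ratio (Bayes' rule, odds form)."""
    prior_odds = prior_prob / (1.0 - prior_prob)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)

# Hypothetical numbers: 10% prior probability of disease, and a positive test
# with sensitivity 0.9 and false-positive rate 0.1 (likelihood ratio = 9)
posterior = update_probability(0.10, 0.9 / 0.1)
print(round(posterior, 3))  # 0.5
```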



Post Image: http://bit.ly/1ivW5TI




R and SPSS difference



>> Question:

"I will be analysing a vast amount of network traffic related data shortly. I will pre-process the data in order to analyse it. I have found that R and SPSS are among the most popular tools for statistical analysis. I will also be generating quite a lot of graphs and charts, so I was wondering what the basic difference is between these two software packages. 


I am not asking which one is better. I just wanted to know what are the difference in terms of workflow between the two besides the fact that SPSS has a GUI. I will be mostly working with scripts in either case anyway so I wanted to know about the other differences."



>> Answer:

"I work at a company that uses SPSS for the majority of our data analysis, and for a variety of reasons - I have started trying to use R for more and more of my own analysis. Some of the biggest differences I have run into include:


1- Output of tables - SPSS has basic tables, general tables, custom tables, etc. that are all output to that nifty data viewer or whatever they call it. These can relatively easily be transported to Word documents or Excel sheets for further analysis / presentation. The equivalent function in R involves learning LaTeX or using odfWeave or LyX or something of that nature.


2- Labeling of data --> SPSS does a pretty good job with the variable labels and value labels. I haven't found a robust solution for R to accomplish this same task.


3- You mention that you are going to be scripting most of your work, and personally I find SPSS's scripting syntax absolutely horrendous, to the point that I've stopped working with SPSS whenever possible.


R syntax seems much more logical and follows programming standards more closely AND there is a very active community to rely on should you run into trouble (SO for instance).


I haven't found a good SPSS community to ask questions of when I run into problems.


Others have pointed out some of the big differences in terms of cost and functionality of the programs. If you have to collaborate with others, their comfort level with SPSS or R should play a factor as you don't want to be the only one in your group that can work on or edit a script that you wrote in the future..."



Highly Supportive:

http://bit.ly/1etqJZy

http://bit.ly/1hbqozI

http://bit.ly/1mwlQW6

http://bit.ly/NuKSIG



Post Image: http://bit.ly/1hbpa7P



On the Go: 3 Interquartile Calculators



Interquartile range (IQR) is the difference between the third and the first quartiles in descriptive statistics.


Make use of this free online calculator to find the interquartile range from the set of observed numerical data (values).
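For readers who prefer to script the calculation, the definition above (third quartile minus first quartile) can be sketched with Python's standard library. The data are hypothetical. Quartile conventions differ between tools, so an online calculator may report a slightly different value for the same data; this sketch uses the default method of `statistics.quantiles`.

```python
import statistics

def interquartile_range(values):
    """Return Q3 - Q1 using the default quartile method of statistics.quantiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

# Hypothetical observations: Q1 = 2.25 and Q3 = 6.75 under the default method
print(interquartile_range([1, 2, 3, 4, 5, 6, 7, 8]))  # 4.5
```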



Supportive:

http://bit.ly/13R6scm

http://bit.ly/19xisUO



Post Image: http://bit.ly/150q7gh

 


☂ Will 2014 be the Beginning of the End for SAS and SPSS ☁?



Learning to use a data analysis tool well takes significant effort, so people tend to continue using the tool they learned in college for much of their careers.


As a result, the software used by professors and their students is likely to predict what the next generation of analysts will use for years to come.


The use of most analytic software is growing rapidly in academia. The only one growing slowly, very slowly, is Statistica.


While they remain dominant, the use of SAS and SPSS has been declining rapidly in recent years...



Relevant:

http://bit.ly/1geKoh5


☁ ☂ ☁ Forecast:

Will 2015 be the Beginning of the End for SAS and SPSS?




On the Infinite Monkey Theorem (IMT)



The Infinite Monkey Theorem (IMT) is a proposition that an unlimited number of monkeys, given typewriters and sufficient time, will eventually produce a particular text, such as Hamlet or even the complete works of Shakespeare.

 

The reasoning behind that supposition is that, given infinite time, random input should produce all possible output. The Infinite Monkey Theorem translates to the idea that any problem can be solved, with the input of sufficient resources and time.

 

That idea has been applied in various contexts, including software development and testing, commodity computing, project management and the SETI (the Search for Extraterrestrial Intelligence) project to support a greater allocation of resources -- often, more specifically, a greater allocation of low-end resources -- to solve a given problem.

 

The theorem is also used to illustrate basic concepts in probability. In 2002, researchers at Plymouth University in the United Kingdom tested the theorem with six crested macaques in a cage with a computer. The monkeys hit the machine with a rock and urinated on it; when they typed, it was mainly the letter "s." However, it should be noted that neither the number of monkeys nor the time allowed for the experiment was infinite.

http://bit.ly/188Unbi
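The basic probability behind the theorem can be sketched in Python. Assuming every key is equally likely and attempts are independent, the chance of typing a given n-character text in one attempt is (1/k)^n for a k-key keyboard, and the chance of at least one success in N attempts is 1 - (1 - p)^N, which approaches 1 as N grows. The six-letter target and 26-key keyboard are hypothetical choices for illustration.

```python
def match_probability(target_length, alphabet_size):
    """Probability that one random attempt reproduces the whole target."""
    return (1.0 / alphabet_size) ** target_length

def at_least_one_success(p, attempts):
    """Probability of at least one success in `attempts` independent tries."""
    return 1.0 - (1.0 - p) ** attempts

# Hypothetical target: a six-letter word such as "hamlet" on a 26-key keyboard
p = match_probability(target_length=6, alphabet_size=26)
print(p)
for attempts in (10**6, 10**8, 10**10):
    print(attempts, at_least_one_success(p, attempts))
```

The printed probabilities climb toward 1 as the number of attempts grows, which is the finite shadow of the "infinite time" argument.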

 

 

In 2011, American programmer Jesse Anderson created a software-based infinite monkey experiment to test the theorem. Anderson used his own computer, working with Amazon Elastic Compute Cloud (Amazon EC2) and Hadoop.

 

The virtual monkeys were a million small programs generating random nine-character sequences. When any sequence matched a string of Shakespearean text, that string was checked off. The project finished the complete works in 1.5 months.
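A scaled-down sketch of the checking-off idea is shown below. It is not Anderson's code: it runs on one machine, uses two-character sequences instead of nine, and works on a short hypothetical phrase so that it finishes in a moment. The function name, the seed, and the draw limit are all assumptions made for illustration.

```python
import random
import string

def million_monkeys(text, chunk_len=2, seed=0, max_draws=1_000_000):
    """Draw random chunks until every chunk_len-character slice of `text` is seen.

    Returns (draws made, slices checked off, total distinct slices).
    `text` should contain only lowercase letters and spaces.
    """
    alphabet = string.ascii_lowercase + " "
    targets = {text[i:i + chunk_len] for i in range(len(text) - chunk_len + 1)}
    remaining = set(targets)
    rng = random.Random(seed)
    draws = 0
    while remaining and draws < max_draws:
        chunk = "".join(rng.choice(alphabet) for _ in range(chunk_len))
        remaining.discard(chunk)   # check the slice off if the chunk matches one
        draws += 1
    return draws, len(targets) - len(remaining), len(targets)

draws, found, total = million_monkeys("to be or not to be")
print(f"checked off {found} of {total} slices after {draws} random draws")
```

Raising `chunk_len` makes each match exponentially rarer, which is why the real project needed cloud-scale computing for nine-character sequences.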

 

The Million Monkey Project was mostly just for fun, and did not really replicate the theorem's scenario. Nevertheless, Anderson's methods could potentially be applied to real-world problems, such as DNA sequencing.

 

In the early 20th century, Émile Borel, a mathematician, and Sir Arthur Eddington, an astronomer, used the Infinite Monkey Theorem to illustrate timescales implied within statistical mechanics.

 

In popular culture, the theorem has appeared in many works, including Russell Maloney's short story "Inflexible Logic," Douglas Adams's "Hitchhiker's Guide to the Galaxy" and an episode of The Simpsons.

 

The IETF's Network Working Group applied the concept in their Infinite Monkey Protocol Suite (RFC 2795), in one of their famous April 1 documents.

 

  

Randomization in Practice: Man vs Monkey

Randomization in Practice: Minitab vs Man

http://bit.ly/1Cvh8Sf

 

 

Hands on Maths:

http://bit.ly/1xKBEZs

http://bit.ly/1CKAjVw

http://bit.ly/1CvaoDL

http://bit.ly/1ECkm5l

http://bit.ly/1BS9LVG

http://bit.ly/1y9i5cl

 

  

IMT in the age of Cheap Super Computers:

Crazy as it seems, the infinite monkey theorem can be proved using basic probability (the trick is having either an infinite number of monkeys or an infinite amount of time, or both). What you could not do, of course, was experimentally verify the monkey theorem.

But that was before cheap supercomputers.

http://n.pr/1gWF84b

http://wrd.cm/1ySOWG

 

 

IMT in simulation:

http://bit.ly/15xDpS4

http://bit.ly/1upFZje

 

 

IMT Demonstrated on Wolfram:

http://bit.ly/1Jijxl5

 

 

Our Own Molecular Monkeys

http://bit.ly/1CZQJK9

 

 

IMT Generator:

http://pixelspread.com/chance/

 

 

> Note on the margin

The Infinite Monkey Theorem may be referred to semi-seriously when justifying a brute force method; the implication is that, with enough resources thrown at it, any technical challenge becomes a one-banana problem. This argument gets more respect since Linux justified the bazaar mode of development....

http://bit.ly/1EaPXhg

 

> Another one in passing:

http://bit.ly/1yDwvpT

 

  

Supportive & Useful:

http://bit.ly/15mpw96

http://bit.ly/15xFDkF

http://bit.ly/1AZPSqv

http://bit.ly/1yDxk1X

http://bit.ly/1AZPmbO

http://bit.ly/1EChXb1

http://bit.ly/1AZQ4G0

http://bit.ly/1t7NP5W

http://bit.ly/1ECj0ro

http://bit.ly/1CmC5y4

http://bit.ly/15mBqj7

http://bit.ly/15DZgbq

http://bit.ly/1BS83Uj

http://bit.ly/1yN14tE

http://bit.ly/1GIGe3U

http://bit.ly/1CZRKBX

 

 

Interesting stuff:

http://bit.ly/1odHSM8

 

 

 

Bonus:

Here comes Python

http://bit.ly/1yZaHqR

 

Solving the Shakespeare Million Monkeys Problem in Real-time with Parallelism and SignalR

http://bit.ly/1ySX6hB

 

 

 

Final Remark:

Is that so?

http://dpo.st/1yNaWnp

 

Still the Infinite Question is: Can Monkeys write Shakespeare?

http://bit.ly/1Bllolf

 

 

Post Image (infinitely ape-like monkey or monkey-like ape?): http://bit.ly/1CvvPEN


Mhd.Shadi Khudr's insight:


This is what I am talking about:

http://bit.ly/1wsLYDS



⌘ To Nest or Not to Nest... That is the Question



 Definitions

In data structures, data organizations that are separately identifiable but also part of a larger data organization are said to be nested within the larger organization. A table within a table is a nested table. A list within a list is a nested list.


In research design, nested designs are also known as hierarchical designs. They are used when there are samples within samples.


In other words, a nested design is one in which levels of one factor (say, Factor B) are hierarchically subsumed under (or nested within) levels of another factor (say, Factor A). As a result, assessing the complete combination of A and B levels is not possible in a nested design. 



 Crossed vs Nested Factors

Two factors are crossed when every category of one factor co-occurs in the design with every category of the other factor. In other words, there is at least one observation in every combination of categories for the two factors.


 A factor is nested within another factor when each category of the first factor co-occurs with only one category of the other. 


If you’re not sure whether two factors in your design are crossed or nested, the easiest way to tell is to run a cross tabulation of those factors. 
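The cross-tabulation check can be sketched in plain Python. The function below records which levels of factor A occur with each level of factor B and applies the two definitions given above. The function name, factor names, and observations are hypothetical, and real designs with missing cells may fall into the "neither" category.

```python
from collections import defaultdict

def classify_factors(pairs):
    """Classify factor B relative to factor A from (level_of_A, level_of_B) pairs."""
    a_levels_for_b = defaultdict(set)
    a_levels = set()
    for a, b in pairs:
        a_levels_for_b[b].add(a)
        a_levels.add(a)
    # Nested: every level of B co-occurs with exactly one level of A
    if all(len(levels) == 1 for levels in a_levels_for_b.values()):
        return "B nested within A"
    # Crossed: every level of B co-occurs with every level of A
    if all(levels == a_levels for levels in a_levels_for_b.values()):
        return "crossed"
    return "neither fully crossed nor nested"

classes_in_schools = [("school1", "classA"), ("school1", "classB"),
                      ("school2", "classC"), ("school2", "classD")]
drug_by_time = [("caffeine", "morning"), ("caffeine", "evening"),
                ("placebo", "morning"), ("placebo", "evening")]
print(classify_factors(classes_in_schools))  # B nested within A
print(classify_factors(drug_by_time))        # crossed
```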


In a nested design, each subject receives one, and only one, treatment condition. 


The major distinguishing feature of nested designs is that each subject has a single score. The effect, if any, occurs between groups of subjects and thus the name BETWEEN SUBJECTS is given to these designs.


The relative advantages and disadvantages of nested designs are opposite those of crossed designs.

  • First, carry over effects are not a problem, as individuals are measured only once.
  • Second, the number of subjects needed to discover effects is greater than with crossed designs. 



In a crossed design each subject sees each level of the treatment conditions. In a very simple experiment, such as one that studies the effects of caffeine on alertness, each subject would be exposed to both a caffeine condition and a no caffeine condition.


The distinguishing feature of crossed designs is that each individual will have more than one score. The effect occurs within each subject, thus these designs are sometimes referred to as WITHIN SUBJECTS designs.

Crossed designs have two advantages.


  • One, they generally require fewer subjects, because each subject is used a number of times in the experiment.
  • Two, they are more likely to result in a significant effect, given the effects are real. 


  Nested vs non-Nested
Nested means here that all terms of a smaller model occur in a larger model. This is a necessary condition for using most model comparison tests like likelihood ratio tests.
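As an illustration of why nesting matters for model comparison, the sketch below runs a likelihood ratio test for the simplest case, where the larger model has exactly one extra parameter. The statistic 2 × (logLik_large - logLik_small) is compared with a chi-square distribution with 1 degree of freedom, whose upper tail can be written as erfc(sqrt(x / 2)). The log-likelihood values and the function name are hypothetical; the comparison is only valid when the smaller model is nested within the larger one.

```python
import math

def likelihood_ratio_test_1df(loglik_small, loglik_large):
    """Return (statistic, p-value) for nested models differing by one parameter."""
    statistic = 2.0 * (loglik_large - loglik_small)
    if statistic < 0:
        raise ValueError("the larger (nesting) model cannot have a lower log-likelihood")
    # Upper tail of the chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

# Hypothetical fitted log-likelihoods
stat, p = likelihood_ratio_test_1df(loglik_small=-120.5, loglik_large=-118.0)
print(round(stat, 2), round(p, 4))  # 5.0 0.0253
```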

 

In the context of multilevel models I think it's better to speak of nested and non-nested factors. The difference is in how the different factors are related to one another. In a nested design, the levels of one factor only make sense within the levels of another factor.


Non-nested factors are a combination of two factors that are not related.



Examples

1- In horticulture, for example, an investigator might want to compare the transpiration rates of five hybrids of a certain species of plant. For each hybrid, six plants are grown in three pots, two plants per pot. At the end of the growth period, transpiration is measured on four leaves of each plant. Thus, leaves are nested within plants, which are nested within pots, which are nested within hybrids. 
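The nesting in this horticulture example can be made explicit by giving every unit an identifier that carries its parents. The Python sketch below enumerates the design from the numbers in the text (five hybrids, three pots per hybrid, two plants per pot, four leaves per plant); the labelling scheme is a hypothetical choice.

```python
from itertools import product

hybrids, pots_per_hybrid, plants_per_pot, leaves_per_plant = 5, 3, 2, 4

# Each row: (hybrid, pot, plant, leaf); every label embeds the labels of its parents
leaves = [
    (f"H{h}", f"H{h}-P{p}", f"H{h}-P{p}-T{t}", f"H{h}-P{p}-T{t}-L{l}")
    for h, p, t, l in product(
        range(1, hybrids + 1),
        range(1, pots_per_hybrid + 1),
        range(1, plants_per_pot + 1),
        range(1, leaves_per_plant + 1),
    )
]

print(len(leaves))                       # 120 leaf measurements
print(len({row[1] for row in leaves}))   # 15 pots
print(len({row[2] for row in leaves}))   # 30 plants
```

Because "pot 1" of one hybrid is a different pot from "pot 1" of another, unique labels like these prevent software from treating the factors as crossed.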



>> Important Note >>

Random effects are random variables, while fixed effects are constant parameters. Being random variables, random effects have a probability distribution (with mean, standard deviation, and shape). In this respect, random effects are much like additional error terms, like the residual, e.



2- The effect of landscape complexity on aphids and on their natural enemies was analysed using mixed-effects models, in which we included landscape sector and field (nested within landscape sector) as random factors to account for the non-independent errors in our hierarchically nested designs... 



⌘ Formulae in R  




⌘ Supportive

  Nested ANOVA: Use nested ANOVA when you have one measurement variable and more than one nominal variable, and the nominal variables are nested (form subgroups within groups). It tests whether there is significant variation in means among groups, among subgroups within groups, etc. 


 Nested ANOVA models in R 


 ANOVA: Split Plot and Repeated Measures 



  Bonus   


✫✫ Nested Analysis & Split Plot Designs  

 

✫✫ Nested Analysis as a Mixed-Method Strategy for Comparative Research



✔✔✔ Super Succinct Info 






✎ VennDiagram: a package for the generation of highly-customizable Venn and Euler diagrams in R



Visualization of orthogonal (disjoint) or overlapping datasets is a common task in bioinformatics.


Few tools exist to automate the generation of extensively-customizable, high-resolution Venn and Euler diagrams in the R statistical environment.


To fill this gap the authors of this paper introduce VennDiagram, an R package that enables the automated generation of highly-customizable, high-resolution Venn diagrams with up to four sets and Euler diagrams with up to three sets.



Highly Supportive: 



 What is a Venn Diagram?

A Venn diagram is an illustration of the relationships between and among sets, groups of objects that share something in common. Usually, Venn diagrams are used to depict set intersections (denoted by an upside-down letter U).


This type of diagram is used in scientific and engineering presentations, in theoretical mathematics, in computer applications, and in statistics.
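The numbers that a two-set Venn diagram displays are simply the sizes of the set differences and the intersection. The Python sketch below computes them for two hypothetical gene lists; it does not draw anything and is not part of the VennDiagram R package described above.

```python
def venn_regions(a, b):
    """Return the three regions of a two-set Venn diagram as sets."""
    a, b = set(a), set(b)
    return {"only_a": a - b, "both": a & b, "only_b": b - a}

# Hypothetical gene identifiers from two experiments
genes_a = {"g1", "g2", "g3", "g4"}
genes_b = {"g3", "g4", "g5"}
regions = venn_regions(genes_a, genes_b)
print({name: len(members) for name, members in regions.items()})
# {'only_a': 2, 'both': 2, 'only_b': 1}
```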


  How to make Weighted Venn diagrams in R with Vennerable

http://bit.ly/1CZoqNd


Venn Diagrams with gplots 

http://bit.ly/1yZ0upn


Venn Diagrams for Regression

http://bit.ly/1y903xK



 ☟

DanteR?!

http://1.usa.gov/1y91Gvl



♒Bonus I:

Exact and Approximate Area-proportional Circular Venn and Euler Diagrams http://bit.ly/1CmoEeM



♒Bonus II:

VennPlex–A Novel Venn Diagram Program for Comparing and Visualizing Datasets with Differentially Regulated Datapoints

http://1.usa.gov/1yZ39zw



 Who is John Venn?


In passing:


Post Image: http://1.usa.gov/1CmmvzG



A million ways to connect R and Excel



In quantitative finance both R and Excel are basic tools for any type of analysis.


Whenever one has to use Excel in conjunction with R, there are many ways to approach the problem and many solutions.

It depends on what you really want to do and the size of the dataset you’re dealing with. I list some possible connections in the table below.



More on R and Excel Integration:

Supportive: 


Bonus:

RExcel is an add-in for Microsoft Excel. It allows access to the statistics package R from within Excel...


The Excel add-in RExcel.xla lets you use R from within Excel. The package additionally contains some Excel workbooks demonstrating different techniques for using R in Excel.



What is Predictive Analytics?



Predictive analytics makes predictions about the unknown future using data mining and predictive modeling. Topics: the process, software, and industry applications of predictive analytics.


Predictive analytics encompasses a variety of statistical techniques from modeling, machine learning, and data mining that analyze current and historical facts to make predictions about future, or otherwise unknown, events.


In business, predictive models exploit patterns found in historical and transactional data to identify risks and opportunities.


Models capture relationships among many factors to allow assessment of risk or potential associated with a particular set of conditions, guiding decision making for candidate transactions. 
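
To make the idea of assessing risk from historical data concrete, here is a deliberately minimal Python sketch. It assumes that a "model" is nothing more than the observed rate of bad outcomes for each condition in past records; real predictive models combine many factors. The names `fit_rates` and `risk`, the condition labels, and the records themselves are all hypothetical.

```python
from collections import defaultdict


def fit_rates(history):
    """Estimate, per condition, the share of past cases with a bad outcome."""
    counts = defaultdict(lambda: [0, 0])  # condition -> [bad outcomes, total]
    for condition, bad in history:
        counts[condition][0] += int(bad)
        counts[condition][1] += 1
    return {c: bad / total for c, (bad, total) in counts.items()}


# Hypothetical historical records: (condition, whether the outcome was bad).
history = [
    ("late_payer", True), ("late_payer", True), ("late_payer", False),
    ("on_time", False), ("on_time", False), ("on_time", True), ("on_time", False),
]
rates = fit_rates(history)


def risk(condition, default=0.5):
    """Score a candidate transaction; unseen conditions get an arbitrary default."""
    return rates.get(condition, default)


print(risk("late_payer"), risk("on_time"))
```

A candidate transaction is scored by looking up the historical rate for its condition, which is the simplest possible version of exploiting patterns found in historical data.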


Predictive analytics is used in actuarial science, marketing, financial services, insurance, telecommunications, retail, travel, healthcare, pharmaceuticals and other fields.



>> Supportive:

http://bit.ly/1o16JTO



>> Bonus:
"It's hard to make predictions, especially when they are about the future" is a quote usually attributed to the American baseball legend Yogi Berra.

>> Not a good start when discussing predictive analytics...


>> What Are Predictive Analytics?!


>> The Traditional View...


>> Predicting the Present... 


>> Shaping The Future...


>> A Better Approach...


>> Words of Warning...
http://bit.ly/1unDnE5



Post Image: http://bit.ly/1rx1m3M


A flavour from Indiana: STATS Indiana: Data Everyone Can Use


STATS Indiana focuses on data for actionable use by Hoosier government, business, education, nonprofits, health organizations and anyone needing to understand “how many, how much, how high or low” for their community.


With nearly 1 million page views and more than 300,000 visits each year, STATS Indiana has won multiple awards from national organizations.


Because of its unique state government/public university partnership and its wide-ranging data and tools, it is frequently cited as a “data jewel in Indiana’s crown.”


STATS Indiana has become Indiana’s information utility and the heart of the Information for Indiana data dissemination channel.


It provides convenient access to data for geographic areas in Indiana and across the nation because we think context and the ability to compare areas on all measures are crucial.


The original catalyst for a statewide, digitally accessible database was the Indiana Business Research Center at Indiana University's Kelley School of Business, and the effort has received major support from the State of Indiana since the 1980s, becoming an outstanding example of the creative partnership that can occur between state agencies and state-funded research institutions.


>> About the Data

The data on STATS Indiana are provided by more than 100 federal and state agencies, along with commercial or private data sources.


The STATS Indiana database also powers Hoosiers by the Numbers, the Stats House and dozens of local and regional websites throughout Indiana.


We add value to these data in the form of calculations, graphs, comparisons of time or geography, time series and maps.


At STATS Indiana, timeliness and accuracy are both critical:

  • We work daily to ensure the data on STATS Indiana are updated as they are released from the source agencies — we don’t let new data sit in a queue waiting for a “scheduled” quarterly update. To help users know what to expect and when, we maintain a release calendar.


  • We use both automated and personal quality control checks to ensure the data coming into the database are accurate. Over the years, we have established relationships with source providers that attest to our keen-eyed work, alerting agencies (such as BLS and BEA) when there is a problem with their data.


Each topic has a landing page that provides the data as well as metadata. These "About the Data" pages provide the essentials users need, including info on frequency, the specific source agency, geographic coverage, years of availability and any caveats related to the data.


>> About the Data

http://www.stats.indiana.edu/data_calendar/whats_new.asp



>> Special Toolkit

http://www.stats.indiana.edu/tools/index.asp



Post Image:

http://bit.ly/1uzJ3KU


Mark E. Deschaine, PhD's curator insight, February 21, 7:32 AM

Every state should have this

StatHat >> Awesome custom stat tracking tool ♫ ♪


StatHat is a custom stat tracking tool. One line of code gets you beautiful charts, automatic alerts, and more...


It only takes one line of code. Beautiful charts, automatic anomaly detection. Used by over 6,000 companies...



Beautiful, accurate views of your stats... 


iPhone App: view stats, get push alerts. Free...


Automatic and manual alerts...


Easy to integrate, designed for the full team to use... 


Simple email reports...


Integration with Status Board and Campfire...


Over 6,000 companies use StatHat to track custom stats...


Handling 1.3 billion API calls per day...


Get your stat data out...


Born at OkCupid...


Free for 10 stats, $99/month for unlimited usage...


You can use StatHat for free for up to 10 stats. Your free account will never expire...


R Programming Course For Frrreee to start on 07/07/2014


Part of the "Data Science" Specialization »

Learn how to program in R and how to use R for effective data analysis. This is the second course in the Johns Hopkins Data Science Specialization.




You will learn how to install and configure software necessary for a statistical programming environment and describe generic programming language concepts as they are implemented in a high-level statistical language. 


The course covers practical issues in statistical computing, which include programming in R, reading data into R, accessing R packages, writing R functions, debugging, profiling R code, and organizing and commenting R code.


Topics in statistical data analysis will provide working examples.



Post Image: http://bit.ly/1t0N91m



What the Statistics Won't Tell You About Single Mothers ◕


There's a small but growing number of women who are single mothers by choice—and the narrative of single motherhood isn't complete without them...


Yet again, single mothers are in the news. The most recent Shriver Report has a list of statistics that make the plight of single motherhood seem quite daunting—numbers that say they are more likely to live with regret and at the height of poverty, struggling so much more than those with partners by their sides...


But the research doesn’t always tell you the full story.



The Shriver report

http://shriverreport.org/



Stats on Gingerbread:

Gingerbread works to tackle the stigma around single parents by dispelling myths and labels.

http://tinyurl.com/77bvo7l



Rise of the single-parent family

http://tinyurl.com/o9o8she



 Single Motherhood Increases Dramatically For Certain Demographics, Census Bureau Reports

http://tinyurl.com/llwuqqs



The Mysterious and Alarming Rise of Single Parenthood in America

http://tinyurl.com/mn5nqwl



Children in single-parent families by race

http://tinyurl.com/qbjfrh9



>> Complementary:

http://tinyurl.com/p4w239h



On the margin: Interesting spotlight on the stats of the Largest and Most Comprehensive Acronyms spot (AllAcronyms)

 

Statistical information about All Acronyms - database of acronyms and abbreviations...

 

1,027,130 acronyms and abbreviations
1,376,812,879 searches served

 

Acronyms and Abbreviations with highest number of definitions...

Longest and/or Most Popular Acronyms:

http://bit.ly/1hhUBOU

 

 

More:

http://sco.lt/82NfdZ

 

 

Post Image: http://bit.ly/IsqsNG

 

Beware of three-way ANOVA; The reason Why Graphpad is still reluctant about it?!

 

GraphPad Prism does not perform three-way ANOVA, but many have suggested that we add three-way ANOVA to a future version.

 

One reason we have been reluctant to add three-way ANOVA to our programs is that it often is much less useful than most scientists hope.

 

When three-way ANOVA is used to analyze data, the results often do not answer the questions the experiment was designed to ask.

 

Let's work through an example:

 

The scientific goals and experimental design

A gene has been identified that is required for angiogenesis (growth of new blood vessels) under pathological conditions.

 

The question is whether it also is active in the brain.

 

Hypoxia (low oxygen levels) is known to provoke angiogenesis in the brain.

 

So the question is whether angiogenesis (stimulated by hypoxia) will be reduced in animals created with that gene removed (knocked-out; KO) compared to normal (wild type, WT) animals.

 

In other words, the goal is to find out whether there is a significant difference in vessel growth in the KO hypoxic mice compared to WT hypoxic mice.
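
The question above is a direct comparison of two groups, so one simple way to look at it is a two-sample test. The Python sketch below computes Welch's t statistic and its approximate degrees of freedom; this is an illustrative assumption about how the comparison might be done, not the analysis prescribed by GraphPad. The measurements are invented for the example, and `welch_t` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    nx, ny = len(x), len(y)
    vx, vy = variance(x) / nx, variance(y) / ny  # squared standard errors
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, df


# Hypothetical vessel-growth measurements (arbitrary units), not real data.
wt_hypoxic = [12.0, 14.0, 13.0, 15.0, 16.0]
ko_hypoxic = [9.0, 10.0, 11.0, 10.0, 10.0]
t_stat, df = welch_t(wt_hypoxic, ko_hypoxic)
print(f"t = {t_stat:.2f}, df = {df:.1f}")
```

Turning the statistic into a p-value needs the t distribution, which a statistics package would supply; the point here is only that the key question involves two groups, not three factors.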

 

What questions would three-way ANOVA answer?

...

 

One alternative approach: Two-way ANOVA

...

 

A better choice? Linear regression?

...

 

 

**Summary**

 

>> Just because an experimental design includes three factors doesn't mean three-way ANOVA is the best analysis.

 

>> Many experiments are designed with positive or negative controls. These are important, as they let you know whether everything worked as it should.

If the controls gave unexpected results, it would not be worth analyzing the rest of the data.

Once you've verified that the controls worked as expected, those control data can often be removed from the data used in the key analyses.

This can vastly simplify data analysis.

 

>> When a factor is dose or time, fitting a regression model often answers an experimental question better than does ANOVA.
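
As a minimal sketch of the regression point in the summary above, the Python code below fits a straight line to a response measured at several doses using ordinary least squares. The data are made up, `fit_line` is a hypothetical helper, and a straight line is only an assumption; real dose-response data often need a nonlinear model.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical dose-response data, made up for illustration.
dose = [0.0, 1.0, 2.0, 3.0, 4.0]
response = [1.0, 3.1, 4.9, 7.0, 9.0]
intercept, slope = fit_line(dose, response)
print(f"response = {intercept:.2f} + {slope:.2f} * dose")
```

The slope summarizes how the response changes per unit of dose in a single number, whereas ANOVA would treat each dose as an unordered category.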

 

 

>> Highly Supportive:

http://sco.lt/8APnzl

 

 

Post Image: http://bit.ly/18uNKtY

 

Power-law♘Distributions in Empirical Data♞V.IMP ☜


A power law is a special kind of mathematical relationship between two quantities. When the frequency of an event varies as a power of some attribute of that event (e.g. its size), the frequency is said to follow a power law.


For instance, the number of cities having a certain population size is found to vary as a power of the size of the population, and hence follows a power law.


The distributions of a wide variety of natural and man-made phenomena follow a power law, including frequencies of words in most languages, frequencies of family names, sizes of craters on the moon and of solar flares, the sizes of power outages, earthquakes, and wars, the popularity of books and music, and many other quantities.

http://bit.ly/1hGO64Y



This page is a companion for the SIAM Review paper on power-law distributions in empirical data, written by Aaron Clauset (me), Cosma R. Shalizi and M.E.J. Newman.


This page hosts implementations of the methods we describe in the article, including several by authors other than us.


Our goal is for the methods to be widely accessible to the community. Python users may want to consider the powerlaw package by Alstott et al.
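
For readers who want a feel for the core calculation, here is a minimal Python sketch of the maximum-likelihood estimate of the exponent of a continuous power law, alpha = 1 + n / sum(ln(x_i / xmin)), with the approximate standard error (alpha - 1) / sqrt(n). It assumes xmin is already known; the paper also covers choosing xmin and testing goodness of fit, which this sketch leaves out. The function name `fit_alpha` is hypothetical, the data are synthetic, and this is not the authors' code or the powerlaw package.

```python
import math
import random


def fit_alpha(data, xmin):
    """Maximum-likelihood exponent of a continuous power law for x >= xmin."""
    tail = [x for x in data if x >= xmin]
    n = len(tail)
    alpha = 1.0 + n / sum(math.log(x / xmin) for x in tail)
    sigma = (alpha - 1.0) / math.sqrt(n)  # approximate standard error
    return alpha, sigma


# Synthetic sample drawn by inverse-transform sampling with a known exponent.
random.seed(0)
true_alpha, xmin = 2.5, 1.0
sample = [xmin * (1.0 - random.random()) ** (-1.0 / (true_alpha - 1.0))
          for _ in range(20000)]
alpha_hat, sigma = fit_alpha(sample, xmin)
print(f"alpha = {alpha_hat:.3f} +/- {sigma:.3f}")
```

Because the sample is generated with a known exponent, the estimate can be checked against the true value, which is a useful sanity test before applying any fitting code to real data.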


NOTE: we cannot provide technical support for code not written by us, and we are busy with other projects now and so may not provide support for our own code.



Journal Reference
A. Clauset, C.R. Shalizi, and M.E.J. Newman, "Power-law distributions in empirical data" SIAM Review 51(4), 661-703 (2009). (arXiv:0706.1062)



Highly Supportive:

Fitting Power Law Distributions to Data

http://bit.ly/1hGLV1r


Power Law Distribution: Method of Multi-scale Inferential Statistics

http://bit.ly/1dfEod6

http://bit.ly/1peOryH


Least Squares Fitting--Power Law

http://bit.ly/1kLJu0m



> Further Support:

http://bit.ly/1epAJHF

http://bit.ly/1miRoRP

http://bit.ly/1d3zAqK

http://bit.ly/1j7pt6Y

http://bit.ly/1epBHnj



⚘ Bonus:

Powerlaw: A Python Package for Analysis of Heavy-Tailed Distributions

http://bit.ly/1eVqNSW



By the Bye:

So You Think You Have a Power Law — Well Isn't That Special?

http://bit.ly/1ozoTxP



Post Image: http://bit.ly/1d3zcbJ


Statist

 

Easy-to-use, lightweight statistics program.

 

Statist is a small and portable statistics program written in C.


It is terminal-based, but can utilise gnuplot for plotting purposes.


It is simple to use and can be run in scripts.


Big datasets are handled reasonably well on small machines.



Download:

http://wald.intevation.org/frs/?group_id=12


Further Info:

http://bit.ly/1ddeple


Documentation:

http://bit.ly/1lNf7F5


Post Image: http://bit.ly/1iPA5WK

