Statistical Machine Learning and Visualization

At the Georgia Institute of Technology

Browsing Posts tagged Matlab

When computing entropy, log-likelihood, etc., it is common to treat 0*log(0)=0, following the interpretation of an event of measure zero. Contrary to this “statistical convention,” IEEE floating-point defines 0*-Inf as NaN (not-a-number). Moreover, IEEE defines NaN+7.2 as NaN (where 7.2 could be any non-NaN numeric value). In Matlab this is particularly annoying when one wishes […]
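The convention in the excerpt above can be sketched in a few lines. This is only an illustration, not the approach the full post takes: the probability vector `p` and the variable names are made up for the example, and the fix shown (overwriting the NaN terms after the fact) is just one simple way to apply the 0*log(0)=0 convention.

```matlab
% Hypothetical probability vector containing an event of measure zero.
p = [0.5 0.25 0.25 0];

% Naive entropy: the last term is 0 * log(0) = 0 * -Inf = NaN,
% and the NaN then propagates through the sum.
H_naive = -sum(p .* log(p));

% Apply the statistical convention 0*log(0) = 0 by zeroing those terms.
terms = p .* log(p);
terms(p == 0) = 0;
H = -sum(terms);
```

Masking on `p == 0` rather than on `isnan(terms)` keeps genuine NaNs in the input visible instead of silently dropping them.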

Matlab Central download
Unfortunately, MathWorks has not designed Matlab to behave like a typical Unix interpreter, i.e., there is no #! (“shebang”) support similar to Bash, Perl, or Python. We demonstrate how to set this up using only about 20 lines of Bash (and a small MEX program written in C). Note: this is a […]

Found a very nice tutorial on Matlab’s Parallel Computing Toolbox (formerly named the “Matlab Distributed Computing Environment”): http://www.bu.edu/tech/research/training/scv-software-packages/pct/

Matlab Central download
SHAREDMATRIX allows certain Matlab objects (see below) to be shared between multiple Matlab sessions, provided they have access to the same shared memory resources, i.e., the processes are on the same physical system. This program uses shared memory functions specified by POSIX and will not work on Windows or any other non-POSIX […]

Repmat is not a built-in function in Matlab, and it is slow. A faster alternative is to do the “repmat” by Matlab indexing:

    x = [1 2 3 4]';  x(:, ones(4,1));   % same as repmat(x, [1 4])
    x = [1 2 3 4];   x(ones(1,4), :);   % same as repmat(x, [4 1])

A more detailed tutorial is found here.

To the best of my knowledge, Matlab doesn’t fully exploit symmetric matrices. An easy fix is to (mis-)use pdist! Suppose we wish to compute the Gramian matrix G=X*X'.

    O = pdist(X, @(a,b) a*b');    % compute the off-diagonal part; a vector
    D = sum(X.*X, 2);             % compute the diagonal part; a vector
    G = squareform(O) + diag(D);  % combine; a matrix

[…]

In a different post, I discussed a taxonomy of data types. The simplest data type is probably a one-dimensional numeric variable. In this post I explore several standard visualizations of one-dimensional numeric data. I demonstrate these concepts using R. We thus assume that our data are numbers sampled iid (independent and identically distributed) from a […]

Matlab and R are two popular languages for data analysis and visualization, and the two are highly similar. Both are interpreted languages that run in a shell-like environment (while also allowing scripts or functions written offline to be run). Both tend to be slow if your code contains many loops but are fast when […]

The script kickoff (also available in the dillon svn branch) can be used to run Matlab tasks in parallel in the background. The scripts were written by Josh and tested by Mingxuan. Type the following at the command line:

    kickoff 1 20 myMatlab.m

The command takes three arguments. The first is tryno (the try number, e.g., 1), […]

Upon a little reflection, it occurred to me that MATLAB does in fact support a rudimentary native hash table:

    terms = { 'price' 'cents' 'govern' 'billion' 'company' 'state' 'economy' 'stock' };
    ids = num2cell(1:length(terms));
    dict = reshape({terms{:};ids{:}},2,[]);
    dict = struct(dict{:});
    dict.('cents') % my two cents!

Of course one could also use java: […]
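As a minimal sketch of how the struct-based dictionary above might be used in practice, the snippet below builds a smaller table and then does a lookup, a membership test, and an insertion. The shortened term list and the variable names `pairs`, `id`, and `hasKey` are illustrative assumptions, not part of the original post. Note that this trick only works when every key is a valid Matlab field name and every value is a non-cell scalar.

```matlab
% Shortened, illustrative term list.
terms = {'price', 'cents', 'govern', 'billion'};
ids   = num2cell(1:numel(terms));

% Stack names over values: a 2-by-n cell array whose column-major
% expansion is name1, value1, name2, value2, ...
pairs = [terms; ids];
dict  = struct(pairs{:});

id     = dict.('cents');          % lookup via a dynamic field name
hasKey = isfield(dict, 'stock');  % membership test before inserting
dict.('stock') = numel(terms) + 1;  % insert a new key
```

Dynamic field names (`dict.(key)`) let the key be held in a variable, which is what makes the struct behave like a lookup table rather than a fixed record.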