Scatterplots: Now showing predictions, acceptance status

October 18th, 2009

Tonight I added a couple of features to the scatterplots (scatter charts / scattergrams) that I introduced in my last post. The two new variables that you can now access are: acceptance status and prediction.

Let’s say you want to know how accurate our prediction engine is within one particular range of predictions. For example, you want to see what really happens when we claim someone has an 80-100% chance of acceptance. Set the axes to Prediction, add a small amount of jitter, and you can get a sense of how many miscategorizations we’ve made in that region. Take a look at Boston College to see an example.

You can now graph our predicted probability of acceptance on scatterplots.

Clearly we do a pretty good job at Boston College. For one, you can instantly see that the blue (accepted) applicants cluster on the right side with high predictions, while the red (rejected) cluster on the left. Additionally, look at the mean (average) lines. The mean prediction for accepted members was 83.3%, while that for rejected members was 35.6% – a very large difference.

Another way to visualize the data is to set one axis to Accepted and the other to Prediction. You then get perfect separation between the blue (accepted) and red (rejected) points, and can perhaps more easily see how well our predictions distinguish the two groups at any particular range of chances.
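The same check can be done numerically. The sketch below is only an illustration: the applicant data is made up, and the function names are hypothetical rather than anything the site exposes. It computes the share of applicants accepted within a prediction range, plus the mean prediction for each group.

```python
from statistics import mean

# Hypothetical (prediction, accepted) pairs; not real MyChances data.
applicants = [
    (0.92, True), (0.85, True), (0.81, False), (0.88, True),
    (0.40, False), (0.35, False), (0.55, True), (0.20, False),
]


def acceptance_rate_in_range(rows, low, high):
    """Share of applicants with low <= prediction <= high who were accepted."""
    in_range = [accepted for pred, accepted in rows if low <= pred <= high]
    if not in_range:
        return None  # no applicants fall in this range
    return sum(in_range) / len(in_range)


def mean_prediction(rows, accepted_status):
    """Mean prediction among applicants with the given acceptance status."""
    return mean(pred for pred, accepted in rows if accepted == accepted_status)


rate_80_100 = acceptance_rate_in_range(applicants, 0.80, 1.00)
mean_accepted = mean_prediction(applicants, True)
mean_rejected = mean_prediction(applicants, False)
print(rate_80_100, mean_accepted, mean_rejected)
```

If the predictions are well calibrated, the acceptance rate inside the 80-100% range should itself land somewhere in that range, and the gap between the two means should be large.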

These updates were requested by Christian Romero; thanks, Christian.

New college admissions tool: Interactive flash scatterplots

October 4th, 2009

We have rolled out our interactive flash scatterplots (also known as scattergrams), available on every college page under the ‘My Analysis’ tab.

These graphs display the accepted and rejected applicants scattered across a 2D canvas according to the variables that you choose. For example, you might look at Unweighted GPA & SAT, or Instate & Average AP Score. To get started with this new tool, see Cornell’s scatterplots.

For any given SAT score, valedictorians appear more likely to get into Cornell than non-valedictorians.

Because there are many, many overlaps, you can set a level of jitter, so each point floats near its true value. For example, if you look at Unweighted GPA and Valedictorian Status, everyone will clump on top of one another. (You either are a valedictorian, or you aren’t, so there are only 2 slots that you might possibly fit into – hence lots of clumping.) If you set a 20% jitter to Valedictorian Status, things will spread out nicely, so you can see what is really going on.
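To make the idea of jitter concrete, here is a minimal sketch. It assumes that a jitter level of 20% means each point is shifted by a random amount of up to 20% of the axis range in either direction; the post does not say how the tool actually scales it, and the function name is hypothetical.

```python
import random


def jitter(values, fraction, rng=None):
    """Return a copy of values, each shifted by uniform noise.

    The noise is at most fraction * (max - min) in either direction
    (an assumed definition of the jitter level). Expects a non-empty list.
    """
    rng = rng or random.Random()
    span = (max(values) - min(values)) or 1.0  # avoid a zero range
    half_width = fraction * span
    return [v + rng.uniform(-half_width, half_width) for v in values]


# Valedictorian status is 0 or 1, so unjittered points pile up on two lines.
valedictorian = [0, 1, 1, 0, 0, 1, 0, 0]
spread_out = jitter(valedictorian, 0.20, random.Random(0))
print(spread_out)
```

Each jittered point stays within 0.2 of its true value here, so the two groups remain clearly separated while the individual points become visible.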

With your feedback and criticism (please post it here or in the forums), we’ll work on improving the tool. Enjoy!

College Rankings #2: Pitfalls of Various Preference-Based Ranking Methods

August 30th, 2009

In my previous post, I introduced the new college rankings system that we have implemented. In short, the system ranks colleges based on where their admitted students decide to attend. In this post, I will discuss some of the approaches that might be considered in creating a preference-based ranking. In the next post, I will discuss the preference ranking system that we have implemented.

2009 MyChances.net College Rankings

Yield isn’t enough

The goal of a preference-based ranking system is to capture people’s true preferences and represent them faithfully. To discover people’s preferences, a reasonable place to start might be a college’s yield. Yield is calculated as follows:

yield = (# of students attending) / (# of students accepted)

So how can we use yield to compare two schools? Suppose we match the University of Georgia (#70 on our list) against Pomona (#50 on our list). In this matchup, Georgia’s 55% yield actually beats out Pomona’s 39% yield. More of Georgia’s admitted students end up attending—so Georgia appears to be preferable to Pomona. But there is a problem here: we have no direct evidence that students, given the opportunity to attend either school, would choose Georgia. We simply don’t know what the students who were admitted to both schools would do.

In the abstract, there is another problem with this approach. Imagine that 100 students apply to both Georgia and Pomona. Suppose Pomona accepts 50 of them but Georgia accepts all 100. Now, suppose the 50 rejected from Pomona all decide to go to Georgia, giving it a 50% yield. Suppose, also, that 40 of Pomona’s accepted students also get into Harvard, Yale, or Princeton, and they all go off to those schools. The remaining 10 enroll at Pomona, leaving it with a 20% yield. Going by yield, it appears that Georgia is the preferred college by far. In reality, every student who was admitted to both Pomona and Georgia and attended one of the two chose Pomona. Yield, in this situation, gives us exactly the wrong answer about which school is preferred over the other!
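The arithmetic of this hypothetical can be written out in a few lines. The numbers come straight from the example above; the variable and function names are just for illustration.

```python
def yield_rate(attending, accepted):
    """yield = (# of students attending) / (# of students accepted)"""
    return attending / accepted


# Georgia accepts all 100 applicants; the 50 rejected by Pomona enroll there.
georgia_yield = yield_rate(attending=50, accepted=100)

# Pomona accepts 50; 40 leave for Harvard, Yale, or Princeton; 10 enroll.
pomona_yield = yield_rate(attending=10, accepted=50)

# Head-to-head: among students admitted to both who attended one of the two.
chose_pomona = 10
chose_georgia = 0
pomona_head_to_head = chose_pomona / (chose_pomona + chose_georgia)

print(georgia_yield, pomona_yield, pomona_head_to_head)
```

Yield ranks Georgia (0.5) above Pomona (0.2), while the head-to-head share shows Pomona winning every direct comparison (1.0).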

Each student matters

What can we learn from the failure of yield as a measure of preference? Summary statistics simply don’t tell us enough. We need to drill down to the level of individual students. Only then can we build up a picture of their collective preferences. How can we do this? One approach might simply be to ask them what their preferences are. For example, we could survey a bunch of students applying to college, and ask them to order all of the schools they are considering, from most favorite to least favorite.

This is better; if those 100 people in our previous example honestly represented their preferences, we would probably see that Pomona was preferred over Georgia. This is the intuitively correct result given our (fake) example. But even this approach isn’t perfect.

Talk is cheap; opinions, cheaper

One problem is that there is no cost associated with ranking a school #1 on your own personal list. Until you actually have to decide which college you are going to attend—and pay tuition to—for the next 4 years, your opinions have no teeth. Let’s say you rank UNC as your #1 school and Duke as your #8 (out of 8), because your Tar Heel family hates those Blue Devils. You apply, and get into both schools. Did I mention that you got a merit scholarship to Duke? All of a sudden, you find yourself attending your supposedly bottom-ranked school. You didn’t lie when you gave us your rankings, but you probably exaggerated how much you preferred UNC over Duke. Furthermore, you didn’t have all of the information that you used to make your decision—such as your merit scholarship—when you reported that Duke was your #8 school.

In general, asking people for their preferences leads to these additional problems:

  • They may give feedback about colleges where their feedback is of questionable value. If someone with a 1.5 GPA says that they rank State U over Harvard, should that hurt Harvard, even though this person almost certainly wouldn’t be given the opportunity to attend there anyway?
  • They almost certainly give feedback that is based on imperfect information. At the moment when people decide to attend one school out of several that admitted them, they have acquired as much information as they think they need to make this huge decision. Beforehand (in particular, before they have applied to and been admitted to colleges), their stated preferences may be much more labile.

Understanding these flaws helps flesh out a framework for a powerful-yet-simple preference-based college rankings system: one where students simply report where they were admitted and where they decided to attend. In my next post, I’ll get into some of the details of how to take this information and construct a ranked preference list. I’ll even demonstrate how this approach addresses a common criticism of the currently popular college rankings: that there is no way to truly distinguish between schools closely ranked (e.g., #3 vis-a-vis #5).

New College Rankings

July 10th, 2009

Presenting: our new college rankings.

The college admissions landscape is littered with college rankings. In 1983, US News first ranked American universities. Since then, rankings have been a fixture of the college world: they are produced by various businesses (US News, Princeton Review, Forbes, Atlantic Monthly), and heeded by students and colleges alike. To gain advantage, some universities have been alleged to manipulate their own rankings. And, while some of the factors used in the rankings are justifiable (alumni giving rate), some seem to be arbitrary (peer assessment surveys asking other colleges about your college’s ‘faculty dedication to teaching’). Each year, the methodology changes slightly, producing a slightly different list. In the end, the factors that are used to come up with the rankings seem arbitrary; the occasional change in the weighting of each factor, capricious. There is a need for a new approach.

Criteria for a ‘good’ college ranking system

  1. The system should be difficult to game; any ‘gaming’ of the system should actually benefit students. In contrast, consider the allegations that some schools tried to manipulate the US News rankings by encouraging more students to apply in order to decrease their acceptance rate.
  2. The factors measured should be relevant to students. In contrast, what Cornell’s dean thinks about the faculty dedication at the University of Texas may be irrelevant.
  3. The overall procedure for generating rankings should be stable from year to year. In other words, any change in the rankings between 2008 and 2009 should be explained by a substantive change in the underlying factors, not by an arbitrary change in how those factors are weighted.

The MyChances College Rankings

We have implemented the MyChances College Rankings based on revealed student preference. In this system, the college admissions process is treated like a chess tournament. The colleges play matches (which occur when 2 colleges admit the same student). In each match, there is a winner (the college that the student ends up attending) and a loser. The winner gains points; the loser forfeits them. When a high-ranked school beats a low-ranked school, the high-ranked school gains few points, and the low-ranked school loses few points. If a low-ranked school beats a high-ranked opponent, it gains more points than if it had beaten an equally matched opponent. After playing many games, the colleges that students prefer rise naturally to the top of the rankings.
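For readers who want to see the point exchange in action, here is a minimal sketch of a standard chess-style Elo update. The K-factor of 32 and the 400-point scale are common chess conventions assumed here for illustration; they are not necessarily the parameters used in the MyChances rankings, and the function names are hypothetical.

```python
def expected_score(rating, opponent_rating, scale=400.0):
    """Probability that `rating` wins, under the standard Elo formula."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / scale))


def update_ratings(winner, loser, k=32.0):
    """Return (new_winner, new_loser) after one match.

    The winner gains exactly what the loser forfeits. The less expected
    the win, the more points change hands.
    """
    gain = k * (1.0 - expected_score(winner, loser))
    return winner + gain, loser - gain


# A student admitted to both schools picks the first one listed.
favorite_wins = update_ratings(1700.0, 1500.0)  # high-ranked beats low-ranked
even_match = update_ratings(1500.0, 1500.0)     # equally matched
upset = update_ratings(1500.0, 1700.0)          # low-ranked beats high-ranked

print(favorite_wins, even_match, upset)
```

With these assumed parameters, the favorite gains about 8 points for an expected win, an even match moves 16 points, and an upset moves about 24.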

Does the method of revealed student preference meet the 3 criteria outlined above? I believe it does.

Consider point #1 (gaming the system). Imagine that MIT wanted to beat out Harvard by trying hard to avoid admitting any students that they thought would be admitted to Harvard. They would end up succeeding in a model based on acceptance rate and yield (since their yield would likely increase), but their actual student body would be less qualified. In the revealed preference model, however, they would be less successful. They would not compete head-to-head with Harvard, so they would ‘win’ more often. But they would be winning against weaker ‘opponents’, earning fewer points for each victory.

For point #2 (relevance), the idea of revealed preference is that it aggregates the sum total of what matters to students – whatever those factors might be. It is likely that students behave rationally (by attending the school that they find most desirable). So long as other students share similar values, then revealed preference rankings will work well in explaining, and even guiding, their decisions.

For point #3 (stability), the tournament style system is simple and straightforward. It is responsive to changes in student preference over time. It does not rely on aggregations of various statistical factors, or college faculty survey results; nor does it depend upon arbitrary weighting of those factors.

The details of the procedure that we use to generate the rankings, and our use of chess-style Elo points, will be explained in a later post. For an academic treatment of a similar college ranking system, I recommend the working paper, “A Revealed Preference Ranking of U.S. Colleges and Universities,” 2005, by Christopher Avery, Mark Glickman, Caroline Hoxby, and Andrew Metrick (free link).

Secret preferences revealed: which colleges do students actually choose?

May 12th, 2009

Today we’re letting everyone in on a sneak-preview of our latest tool: the college cross-admit preference tool. We think it’s a simple but powerful way to see which colleges are most favored by admitted college students.

Using it is simple: type in the names of two colleges that you want to compare (perhaps Florida and Florida State?). You’ll then see what fraction of site members prefers each school. Preference is determined by the relative fraction of members admitted to both schools who end up attending one or the other. For example, if 75% of students admitted to both College A and College B ultimately go to College B, we say they prefer College B over College A. When the results are statistically significant at the 95% level, you’ll see the results lit up in bright colors.
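As a rough illustration of the significance check, the sketch below runs an exact two-sided binomial test against a 50/50 split. This particular test is an assumption made for the example; the post does not say which test the tool uses, and the function name and counts are hypothetical.

```python
from math import comb


def cross_admit_preference(chose_a, chose_b, alpha=0.05):
    """Compare two colleges using students admitted to both.

    chose_a / chose_b: how many cross-admits attended each school.
    Returns the share choosing A, a two-sided p-value against an even
    split, and whether the result is significant at the given alpha.
    """
    n = chose_a + chose_b
    if n == 0:
        raise ValueError("no cross-admits attended either school")
    k = max(chose_a, chose_b)
    # Probability of a split at least this lopsided in one direction.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    p_value = min(1.0, 2 * tail)
    return {
        "share_a": chose_a / n,
        "p_value": p_value,
        "significant": p_value < alpha,
    }


result = cross_admit_preference(chose_a=15, chose_b=5)
small_sample = cross_admit_preference(chose_a=3, chose_b=1)
print(result, small_sample)
```

A 15-to-5 split clears the 95% threshold, while the same 75% share from only four students does not, which is why sample size matters as much as the percentage.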

For the hardcore college admissions followers out there, this will remind you of this graphic from a 2006 NY Times article. One difference is that our list isn’t limited to 17 schools; as the data continues to become available, we’ll display this information for all 1700 schools that we track.

Requests? Feedback? Suggestions? Let us know.

Adding Threaded Comments for Author Pages on P2 Wordpress Theme

May 10th, 2009

P2 is one of the sweetest WordPress themes out there (see http://ma.tt/2009/05/how-p2-changed-automattic/), but for certain applications it needs tweaking. At MyChances.net, we wanted to give each of our members their own blog so they could write about the college application process. To accomplish this, we needed two additional features: 1) Threaded comments for author pages. If members are going to feel like they have a legit blog, they need to have user comments on the articles they are writing. 2) A quick post box on the author’s own page. If the logged-in user is the author of the blog, that user should see the “Hi, watcha up to?” quick post box on their blog, NOT just on the main page (aka the “firehose” – thanks Twitter!).

Adding comments to author pages in the P2 WordPress theme is an easy process. You just need a few minutes and a very small amount of programming knowledge.

Step 1:  Open up wp-content > themes > P2 > entry.php.

Step 2: Find the following line:  if ( ( is_home() || is_front_page() ) ) $withcomments = true;

Step 3: Change the line to:  if ( ( is_home() || is_front_page() || is_author() ) ) $withcomments = true;

Step 4: Save entry.php, and make sure that you upload it to your P2 directory.

Step 5: Profit.

If you need comments on the tags page, you can simply change the line to this:

if ( ( is_home() || is_front_page() || is_author() || is_tag() ) ) $withcomments = true;

Essentially, you are adding logic so that if the page the user is looking at is an author page OR a tag page, the $withcomments variable is turned on and the page is allowed to show the threaded comments feature, which we all love.

You can see this functionality live here:

http://www.mychances.net/membership/blog/author/Brent/

and here:

http://www.mychances.net/membership/blog/tag/member-blogs/

Now for our second need: showing the quick post box on an author’s own page (if the logged-in user is the author).

Step 1:  Open up wp-content > themes > P2 > author.php.

Step 2: At the very top of the page, add this line inside PHP tags, like so:

<?php
// Populate the global $current_user object for the logged-in user.
get_currentuserinfo();
?>

This allows you to access information about the current user, such as their username. In PHP, the username is available as:

$current_user->user_login (So if you want to echo out the username of the logged-in user, you can put <?php echo $current_user->user_login; ?>)

Step 3: Just below the line that has <div class="sleeve_main" id="userpage"> insert this block of code:

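<?php
// Show the quick post box only if the logged-in user can publish and is this page's author.
if ( current_user_can( 'publish_posts' ) && $current_user->user_login == $author->user_nicename ) {
    require_once dirname( __FILE__ ) . '/post-form.php';
}
?>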

This checks two things: 1) whether the current user is allowed to publish posts at all, and 2) whether the current username matches that of the author. If both criteria are met, it displays the “post form”, which is the quick post box. Otherwise, nothing shows up, so a normal visitor just sees a regular blog page.

That’s it!  Please let me know if you have implementation questions.

Brent

Forking Daemons in PHP

April 30th, 2009

Note: this is from http://bipinb.com/making-php-program-as-daemon.htm . It has been intermittently offline, so I’m archiving it here for future reference.


<?php
include_once('createdb.php');

// ticks are required so that signal handlers run between statements
declare(ticks=1);

$pid = pcntl_fork();
if ($pid == -1) {
    die("could not fork");
} else if ($pid) {
    exit(); // we are the parent
} else {
    // we are the child
}

// detach from the controlling terminal
if (posix_setsid() == -1) {
    die("could not detach from terminal");
}

// record the daemon's process ID
$posid = posix_getpid();
$fp = fopen("/var/run/process.pid", "w");
fwrite($fp, $posid);
fclose($fp);

// set up signal handlers
pcntl_signal(SIGTERM, "sig_handler");
pcntl_signal(SIGHUP, "sig_handler");

// loop forever performing tasks
$dbobject = new DB();
$dbobject->getCon();
while (1) {
    // do something interesting here; this example calls a function from another file, "createdb.php"
    $dbobject->CopyCallFiles();
}

function sig_handler($signo)
{
    switch ($signo) {
        case SIGTERM:
            // handle shutdown tasks
            exit;
        case SIGHUP:
            // handle restart tasks
            break;
        default:
            // handle all other signals
    }
}
?>

UCSD Accidentally Sends Accepted letters to 28,000 Rejected Students

March 31st, 2009

The University of California, San Diego accidentally sent an acceptance email to 28,000 students who had already received letters of rejection. School administrators blamed the error on “access[ing] the wrong database.”

Has anyone on MyChances received one of the 28,000 acceptance goof-ups?

 

Read the full story here from the LA Times:  http://latimesblogs.latimes.com/lanow/2009/03/uc-admissions.html

Are you suffering from WFCR Syndrome?

March 26th, 2009

It’s March, and thousands upon thousands of high school seniors are suffering from what our users are calling “Waiting for College Reply Syndrome”. Symptoms include uneasiness, constantly wondering if the college actually got your application materials, and feeling like you are the only person who hasn’t heard back from a school. If you’re feeling this right now, please know that you are not alone. Hop onto our forums and commiserate with others in the same situation, go play our Admissions Expert game, and check out the profiles of other students applying to your schools to pass the time (and get transparency into the application decisions coming out!).

HostGator causes unannounced downtime

March 17th, 2009

We’ve been hosted at HostGator for a couple of years now, and have had a good experience with them until last night.

Last night, HostGator made as-yet-undisclosed, and unannounced, security changes to their servers. During this period, they put up ‘Under Maintenance’ signs across all hosted sites. For us, this lasted from about 3:00am-4:00am. Problematically, these pages were not marked as non-cacheable, so browsers and proxies were free to store them. As a result, some visitors are still seeing these pages instead of the current content.

Far more troubling is what happened to the databases during that time. We started getting database errors around 1:30, and one table even crashed. At that point, we ran a repair command, which was successful. So far, so good. Then, from 3-4am, the ‘Under Maintenance’ signs were put up. That was also not really a problem, since no database modifications could be made during that time.

When those ‘Under Maintenance’ signs were turned off, the site was functional again and I assumed we were good to go. We allowed members to continue signing up and making changes to their profiles. We made forum posts. I even did a fair amount of college-name hygiene, replacing less-common college names with their more common nicknames (Virginia Polytechnic Institute and State University ==> Virginia Tech).

This is where I become profoundly disappointed in HostGator: I awoke around noon (hey, I’m on spring break) to find that none of my changes had stuck. In fact, a whole bunch of forum posts that were made after the site came back online were deleted. Most problematically, college profile updates and new member accounts created in the few hours before and after the update were also deleted. From my perspective, there is no good excuse for this. Since HostGator was aggressive enough to replace our site’s content with “Under Maintenance” signs, all maintenance should have been completed while those signs were still up. The site became accessible, but then had its database rolled back to a version from approximately 5 hours prior. Whatever triggered them to do this, I do not know. What I do know is that they exhibited poor business practices last night.

For all of you who modified your profile last night from about 11pm Eastern to 6am (and there were surprisingly many of you), we apologize that your changes were lost.
