MySchool 2.0… I’m ready for another scraping challenge.

February 12, 2011

A quote from Parliament (government source; sorry, it's not up on OpenAustralia yet):

Ms SMYTH (La Trobe) (3:15 PM) —My question is to the Minister for School Education, Early Childhood and Youth. How will the My School website deliver greater transparency and information to parents?
Mr GARRETT (Kingsford Smith) (Minister for School Education, Early Childhood and Youth) —I thank the member for her question. The fact is that My School has transformed community understanding of school performance, providing the community and parents with information about important areas of schooling, including literacy and numeracy—information that was never before available other than for education bureaucrats and officials. With some 4.6 million visits to the My School site since its launch, it is clearly a matter of great interest to all Australians.

Now My School 2.0 will take transparency to a new level, with significant new features. Importantly, financial data on each school will be reported for the first time to everyone to provide a clear picture of the resources that are provided to schools to support the education of students. The collection of this financial data is a complex task and to ensure that it is robust information and comparable, the Australian Curriculum Assessment and Reporting Authority—ACARA—commissioned a detailed validation process undertaken by leading accounting firm, Deloitte. Later in November Deloitte identified an anomaly in the information collected, which could lead to a misstatement of recurrent income for independent schools. Deloitte recommended to ACARA that further validation and consultation take place.

On 2 December I announced that My School 2.0 would only be launched after this further validation, and consultation with independent schools, had been undertaken. I also wrote to the ACARA chair requesting a detailed timetable on how outstanding school data issues would be resolved. I can inform the House that over summer ACARA has liaised extensively with individual schools and consulted with the Independent Schools Council and relevant state and territory associations of independent schools because I wanted to make sure that impacted schools had been contacted by ACARA and that they had had time to check their data and understand how the data will be used and reported.

I can advise that every independent school in the country has been contacted by ACARA by email. Follow-up contact by telephone has been made when requested and as required and over 900 independent schools now have school finance data reports that have been quality assured by ACARA, Deloitte and my department. These schools are now being given the opportunity to review what their My School finance page will look like because My School 2.0 will also include an enhanced ICSEA—Index of Community Socio-Educational Advantage—where the methodology has been improved to provide a more accurate direct measure based on parent education and occupation. It is a case of making good data even better.

The change to ICSEA methodology has led to some changes in ICSEA values and less than four per cent of all schools did request a review of their ICSEA value. These schools were given the opportunity to provide more relevant data or information on changed school context. Now schools and systems have had nearly three months to provide the additional data in support of their review request. Information has been considered by the ICSEA expert panel to be satisfied that each school’s value is robust based on the most accurate data available.

So today, I am pleased to advise that the further work I asked ACARA to do in relation to school data has almost been completed and that My School 2.0 will be ready for release on 4 March. With My School 2.0 ready for release on 4 March the government will deliver an important and fundamental reform—one acknowledged by the Leader of the Opposition as worthy of the name ‘reform’. Importantly, it is a reform that empowers parents to influence the quality of their child’s schooling and empowers education ministers, for the first time, with a national dataset to target school improvement. Importantly, it is a reform that underpins this government’s substantial reforms in the area of education. We want to provide every school in Australia with the possibilities of a great education.

So, come March 4, I look forward to writing another scraper to get this data in a usable exchange format, and out of the unusable HTML mess.
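To give a rough idea of what "usable exchange format" means here, below is a minimal sketch that flattens an HTML table into CSV using only the Python standard library. It is not the code from my existing scraper, and the sample markup, the `TableExtractor` class and the `html_table_to_csv` function are all made up for illustration; the real My School pages will almost certainly need more careful handling.

```python
import csv
import io
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
            self._cell = None
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (e.g. whitespace between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def html_table_to_csv(html):
    """Return the table rows found in `html` as CSV text."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Hypothetical markup, not copied from the My School site.
SAMPLE_HTML = """
<table>
  <tr><th>School</th><th>Year</th><th>Reading</th></tr>
  <tr><td>Example Public School</td><td>5</td><td>  480 </td></tr>
</table>
"""

if __name__ == "__main__":
    print(html_table_to_csv(SAMPLE_HTML), end="")
```

The sketch treats every row in the input as part of one table and ignores nesting, which is fine for a toy example but would need revisiting for pages with several tables.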

p.s. data dumps and scraping code for the existing site are currently up at https://github.com/andrewharvey/myschool.

  1. A Slee
    February 28, 2011 at 10:12 am

    I would be very interested in looking at a heat map of Newcastle and the Hunter Central Coast region when you have looked at this.


  2. anonymous
    March 4, 2011 at 11:47 am

    I would like to see a scientifically rigorous effort at trying to scrape it, as there are additional controls in place, including legal ones. I’m not sure you’ll be legally allowed to publish the resulting data sets, but from a technical point of view, scraping is much harder this time around.

    Please consider a good write-up, including paths not taken, challenges, as well as what worked.

    • Andrew Harvey
      March 19, 2011 at 8:19 pm

      Yeah, I spoke too soon. I won’t be trying again since they seem to require a CAPTCHA.
