
View Full Version : would like help saving an entire website


mosman
10-15-2006, 09:38 AM
hey guys,
I'm a very fortunate uni student: through the university website, using my uni login and password, I have access to a number of very expensive online resources (medical textbooks and research sites). The subscriptions alone would cost me about $1k a year as soon as I graduate. Now, I know that you can cache/save websites onto your computer by using the Work Offline feature of IE7, but I want to know how to take everything under the domain name, as I don't really want to be individually going to every page in the site.
Is this possible?
Thanks in advance

PrntRhd
10-15-2006, 09:45 AM
Are there RSS feeds on the sites in question?
In Firefox there is a "DownThemAll" extension.

pangea33
10-15-2006, 10:15 AM
Are there RSS feeds on the sites in question?
I doubt it if the subscriptions run several hundred dollars a year. I haven't seen authenticated RSS yet, but it might be out there.


...I don't really want to be individually going to every page in the site.
Why bother archiving these resources if you won't even go to the trouble of reading them once? Since you're logging in with a username and password, this mass download activity will be logged. If you're not supposed to archive these resources, you'll be leaving a risky trail. If you are allowed to archive them on the other hand, there is probably an easier way such as a pdf file or some other export format.

Budfred
10-15-2006, 11:22 AM
And if you are not allowed to archive them, which is EXTREMELY likely, what you are proposing is stealing and we do not help people steal things here... You are also probably talking about terabytes of info which will probably be updated each month with a few more gigabytes... Is there really a point to even trying to steal that database??

mosman
10-15-2006, 08:56 PM
I doubt it if the subscriptions run several hundred dollars a year.
Why bother archiving these resources if you won't even go to the trouble of reading them once?


OK, thanks for the pessimism and for assuming I'm lazy :p

I'm talking about an archive of over 200 medical textbooks. Check it out for yourself if you don't believe me: it's the Access Medicine website, which gives access to Harrison's and other really friggin good resources. The subscription for that site alone is $595/yr. The other one that comes to mind is MD Consult, which also has a heap of textbooks but also has access to Medline searching, which allows me to search all the clinical trials that have been done on almost any medical topic (i.e. which drug works better in x scenario and what the side effects are). This one ONLY costs $349/yr.

As for reading the resources, each textbook in hard copy is a HUGE HUGE book, and it would send me crazy if I even tried to read them all. It's on a need-to-know basis: I'm a medical student, and as I learn about topics I simply look 'em up and learn as much as I can about each topic.

Anyway, that's just two of the resources; the uni also has access to heaps more that I use infrequently but find helpful.

They don't have RSS feeds (well, they do, but only for limited content, which I am downloading). Any ideas? PDF would be difficult, unless I'm not doing it right? I usually use the Adobe toolbar in IE7 and just convert the current page to PDF, and I pretty much get whatever is in front of me and that's all.

help:confused: :confused: :confused:

mosman
10-15-2006, 09:02 PM
And if you are not allowed to archive them, which is EXTREMELY likely, what you are proposing is stealing and we do not help people steal things here... You are also probably talking about terabytes of info which will probably be updated each month with a few more gigabytes... Is there really a point to even trying to steal that database??

OK, fair point, but having said that, having access to the website as it stands is kosher, is it not? Therefore what difference does it make that I no longer have access to the new content? It would be stealing if I cracked into the website and then downloaded it all, but I am actually allowed to access this info now. Yes, it would be a lot of info, but I think it would be useful, as the two websites mentioned (specifically the Access Medicine site) are pretty much the pinnacle of medical information. Really, I don't think this is any worse than ripping an MP3. If you also don't rip MP3s then I can respect your input and hesitation, but I'm not trying to crack the password so I can illegally gain access when I leave uni; I just want what's there now.

And come on, do you want your future doctor to not have the best knowledge when he's treating you as a grad? :p

mjc
10-15-2006, 09:22 PM
Most sites frown upon mass downloading of their content. Archiving the pages you are currently viewing is usually considered fine, but not grabbing the whole thing at one go. The best advice, and safest, would be to archive what you are using...or at least think you are going to use in the very near future.

As far as it goes, IE's archiving abilities suck... try Firefox and look for one of the extensions specifically designed for archiving, like this one (https://addons.mozilla.org/firefox/427/).

Budfred
10-15-2006, 10:03 PM
No, I do not rip MP3s and I don't have much respect for people who do...

And no, I do not want my doctor relying on outdated information that he/she took (unlawfully) from a database several years ago... I want my doctor to be up on new developments... I work in healthcare and we have access to what you are describing and more whenever we need it... I suspect that will eventually be true of most medical facilities in the world even if it is not now... Also, I am not really sure you understand what a terabyte is if you so readily dismiss how large this database is...

Rick
10-15-2006, 10:20 PM
I'll add one small point to this.
The data is more than likely stored in a database format,
not in HTML.
So the ONLY way to get ALL of the info would be to run searches on everything possible (should only take two or three lifetimes).

When you conduct a search, the data is drawn from the database and then added to an HTML form/page.
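
To make the point above concrete, here is a minimal sketch of a database-backed page, assuming a toy SQLite table. The table name `chapters`, the placeholder rows, and the helpers `build_demo_db` and `render_search` are all hypothetical and are not taken from any real site. It only illustrates that the HTML does not exist until a query is run, so a tool that follows static links never sees rows nobody has searched for.

```python
import html
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database holding two placeholder rows (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE chapters (title TEXT, body TEXT)")
    conn.executemany(
        "INSERT INTO chapters (title, body) VALUES (?, ?)",
        [
            ("Hypertension", "Placeholder text A"),
            ("Asthma", "Placeholder text B"),
        ],
    )
    return conn


def render_search(conn: sqlite3.Connection, term: str) -> str:
    """Build an HTML page on the fly from the rows whose title matches the search term."""
    rows = conn.execute(
        "SELECT title, body FROM chapters WHERE title LIKE ?",
        (f"%{term}%",),
    ).fetchall()
    # The page is assembled only now, from whatever the query returned.
    items = "".join(
        f"<li><b>{html.escape(title)}</b>: {html.escape(body)}</li>"
        for title, body in rows
    )
    return f"<html><body><ul>{items}</ul></body></html>"
```

Calling `render_search(build_demo_db(), "hyper")` produces a page containing only the matching row; a different search term produces a different page from the same database.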

mosman
10-16-2006, 12:41 AM
No, I do not rip MP3s and I don't have much respect for people who do...

... I work in healthcare and we have access to what you are describing and more whenever we need it... I suspect that will eventually be true of most medical facilities in the world even if it is not now... Also, I am not really sure you understand what a terabyte is if you so readily dismiss how large this database is...


A couple of things to say to that.
Firstly, I must respect what you have said so far, as at least you are consistent. If you do not rip MP3s for your own use then I commend your morals. I simply do not have the willpower to hold myself back if I can't physically see the person I'm downloading from. It's terrible, but it happens, I guess.

Secondly, yes, a lot of healthcare providers have access to sites similar to these, but in my experience so far I have only seen limited access to some of the sites. With uni I have unlimited access to ALL of it, something I want to take advantage of.

Thirdly, I simply asked an IT question about how to do it. I wasn't planning on taking the whole thing, as I'm aware that the whole database would be huge (I am aware that a terabyte is 10 to the power of 12 bytes or somewhere in that vicinity), but if I learnt the basic principle I could then apply it to the textbooks of interest to me, and I would happily sacrifice up to 100 GB for this information, for I'm sure you know that these sites are pretty much unparalleled.

If, having read all this, you still don't want to help me because it doesn't sit well with your morals, that's fine. I can respect that, as that's what makes the world go round; it would be boring if we all thought the same (actually it would be fine if everyone thought like me :p ). But I still leave the question out there for someone else to help me out with.

Thanking you

Paul Komski
10-16-2006, 01:50 AM
A terabyte is 1000 gig. So about 250 normal DVDs or equivalent. Unless you have extremely fast broadband it is thus likely to take in excess of a month or two of constant downloading to grab just one terabyte.
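
As a rough check on those figures, here is a small back-of-the-envelope sketch. The 4.7 GB single-layer DVD capacity and the 2 Mbit/s link speed are assumptions chosen for illustration, and the function names are hypothetical; a different disc capacity or line speed changes the answers proportionally.

```python
import math

TERABYTE_BYTES = 10**12      # decimal terabyte, i.e. 1000 gigabytes
DVD_BYTES = 4.7e9            # assumed single-layer DVD capacity
SECONDS_PER_DAY = 86_400


def dvds_needed(size_bytes: float, disc_bytes: float = DVD_BYTES) -> int:
    """Number of discs required to hold size_bytes, rounded up."""
    return math.ceil(size_bytes / disc_bytes)


def download_days(size_bytes: float, link_mbit_per_s: float) -> float:
    """Days of uninterrupted downloading at a sustained link speed in megabits per second."""
    bits = size_bytes * 8
    seconds = bits / (link_mbit_per_s * 1_000_000)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    print(dvds_needed(TERABYTE_BYTES))                  # discs for one terabyte
    print(round(download_days(TERABYTE_BYTES, 2), 1))   # days at an assumed 2 Mbit/s
```

Under these assumptions one terabyte is a little over 200 discs and about a month and a half of constant downloading, which is in line with the estimate above.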

We assume you have read ALL THE SMALL PRINT of any agreements you have agreed-to in your registration and that such mass downloading is OK in that regard. You WILL leave a trail and as Rick pointed out the spidering will only work for links and not for DB access/searches. Any external links could be problematic plus the fact that such mass downloading degrades servers and could naff off various bodies.

Personally speaking, it's a futile exercise and a waste of IT resources now, for little benefit to you in the future, when you should in any case be earning enough for the sums involved to be tax-deductible small beer for access to the most up-to-date material.

Another practical point is how you will find and utilise the various bits of information once downloaded. With broadband and good search engines it has become much easier to access the material you want or need online, as you need it, than to access it locally, unless you have the know-how to set up your own server and search database.

Budfred
10-16-2006, 02:12 AM
mosman,

I believe you are missing the main point here... We will not let people post information here about ripping off a proprietary database, even if it is possible... We are a forum that is here to help people with legitimate problems, not to help people who want to steal using digital means... If you choose to continue to pursue this unethical option, you will need to go to a forum with lower standards than this one... Be warned that a forum that provides that type of assistance may also be booby trapped with malware....

By the way, since you say you are studying to be a doctor, when do they get around to the course on ethics??

mosman
10-16-2006, 04:31 AM
Budfred, thanks for the cheap shot. I really appreciate people who use the obvious. Obviously, ripping MP3s and trying to archive websites which I am allowed to use makes me a terrible person, and people in Africa are going to die because of my actions. There are bigger fish to fry than my actions, and I'm sure that you have never done anything "unethical", so these same rules do not apply to yourself. Think outside the square: I'm sure you have done other things, not necessarily digital, that many would say the jury is still out on. If you haven't, and this applies to everyone else on the forum, then my apologies for corrupting your sanitised world.

I apologise if I have offended anyone. I have obviously come to the wrong forum. But I'm a bit sceptical as to just how clean your slate is, so to speak, Budfred. None of my business.

Thanks anyway guys for having a forum and allowing people to express their opinions, keep up the good work.

P.S. I'm NOT being sarcastic; I just think this particular forum might not be for me. Enjoy it though ;)

Budfred
10-16-2006, 08:58 AM
I agree, this particular forum is probably not for you...

You came here asking our help to do something that is illegal and clearly unethical... You respond to limits placed on your ability to get help with your stated intention to violate law and ethics by questioning my ethics in a vague and insulting manner... I am not commenting on your theoretical ethical lapse, I am commenting on your clear intention...

In my work we hold ethical behavior to be one of the most important components of a career and I can quite honestly say that I have not intentionally done an unethical act at any time... You have not even begun your career and you are already looking at ripping off a tool that many would give a fortune just to have access to... If you opt to return and question my ethics or otherwise cast aspersions on any of our membership, I will personally permanently show you the door... Meanwhile, in case I ever have the misfortune to encounter you in a professional capacity, I hope you learn how important ethics are before you begin working with patients...

And yes, you are being sarcastic and insulting...

mosman
10-16-2006, 10:47 PM
Ahhh, my dearest Budfred... "clearly unethical" applies to things such as murder and rape. The mere definition of the word ethics is the study of a set of morals imposed by society which determine what is "good" versus what is "bad". Ethics is a very grey area, especially the ethics of digital copying; such a thing has not been around long enough for a whole society to decide whether it is wrong or right. Don't forget that there are numerous confounding factors in something like this. So you are only showing your ignorance when speaking so confidently about such things.

I tried to end my relationship with this forum in a friendly "whoops, I guess I came to the wrong place" manner, but you had to come back, throw in your two cents and make a personal attack. I tried to say that's fine, I respect your BS, but you didn't want any of it and attacked my personal set of morals because they didn't match up to yours. Sorry, but that's pretty closed-minded and very petty of you.

As for my professional career, that's none of your business, and it should suffice to say that your judgmental behaviour is in itself foolish. How can you make a call on someone's entire ethical approach to life, and how it will impact on their career, based on 5 or 6 posts? You can't, and you're a fool if you try.

You can show me the door, but I will simply accept that as a victory and take it as you failing to come up with any real substantial argument.

Sorry if I have offended anyone else in the forum. I never set out to do anything of that nature and I didn't foresee this request turning into a one-on-one debacle.


And yes, you are being sarcastic and insulting...

And just so you know, sarcasm is based on whether I intended the remark to be untrue and insincere. So by definition I was being honest to the rest of the members of the forum and generally showing that I think forums are great places to meet people and share knowledge. I was NOT being sarcastic. Seems to me like someone simply has a bit of a moderator inferiority complex and needs to feel like they know and control everything. As for insulting, I didn't make any comment on that. I didn't mean to, but insult is a personal thing and depends on how you perceive my comments, so if I did insult you... initially I would have apologised, but you are proving to be quite an annoying personality, so tough $hit. Grow up and learn to be a little more realistic.

Budfred
10-16-2006, 11:16 PM
You clearly don't understand the concept of ethics and you revert to petty personal attacks and misinterpretation to defend your stance that it is okay to steal databases that clearly say it is illegal to do so... I hope your teachers and future employers understand your ethical stance and act accordingly...

Just so you know, I really don't care if you get it... I suspect you have powerful abilities to rationalize your dishonest behavior and general rudeness, so I will not be able to influence you to take a more honorable approach to an honorable profession... I am sad about that, but resigned to it... I am concerned about others who read this and wish to make it clear that your approach is not endorsed or accepted by this forum...

Paul Komski
10-17-2006, 10:09 AM
Please don't let this get personal or out of hand folks.

Downloading any material from publicly accessible websites could hardly be an illegal or immoral act. If from a private site, then that depends on any agreements that have been (possibly unknowingly) entered into. The way to be sure of that status is to ask the appropriate personnel whether such actions are admissible or not. No rocket science involved.

Using copyrighted material of any kind inappropriately and without consent would be illegal whether downloaded or written out by hand.