Index Scrape/Crawl of Solidworks Forum

Make suggestions to the moderators and admins of this site.
jmongi
Posts: 101
Joined: Wed Mar 24, 2021 1:25 pm
Answers: 0

Before the "transition", would it be feasible to scrape/index the current SolidWorks user forum to at least get it into a usable, searchable file that future, more ambitious programmers could work with?

I'm just assuming that it will be much easier to do (if it's possible at all) in its current incarnation than after the transition to the swamp. Just a thought for those with way more programming experience than I have.

I would think you could write a script to systematically navigate through the threads, pull the source (HTML?), copy it to a file, and repeat. On its own, the information might not be very usable in that format, but it would at least be available to be transformed in the future.
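
For what it's worth, the loop described above (visit each thread, pull the HTML, write it to a file, repeat) could be sketched roughly like this in Python. This is only a minimal sketch under assumptions: the names `fetch_html` and `save_threads`, the `thread_00000.html` file naming, and the one-second pause are all made up for illustration, the list of thread URLs has to come from somewhere else, and it assumes you are actually permitted to copy the pages.

```python
import time
import urllib.request
from pathlib import Path
from typing import Callable, Iterable, List


def fetch_html(url: str) -> str:
    """Download one page and return its source as text."""
    with urllib.request.urlopen(url, timeout=30) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def save_threads(
    urls: Iterable[str],
    out_dir: str,
    fetch: Callable[[str], str] = fetch_html,
    delay_seconds: float = 1.0,
) -> List[Path]:
    """Fetch each URL in turn and write its HTML to a numbered file."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, url in enumerate(urls):
        html = fetch(url)
        # Hypothetical naming scheme: thread_00000.html, thread_00001.html, ...
        path = out / f"thread_{index:05d}.html"
        path.write_text(html, encoding="utf-8")
        saved.append(path)
        # Pause between requests so the server is not hammered.
        time.sleep(delay_seconds)
    return saved
```

Passing the `fetch` function in as a parameter is just a convenience so the loop can be tried out against canned HTML before it ever touches a real site.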
Designated Pot-Stirrer
SPerman
Posts: 1834
Joined: Wed Mar 17, 2021 4:24 pm
Answers: 13

Re: Index Scrape/Crawl of Solidworks Forum

I am hoping the Wayback Machine will take care of that for us.

https://web.archive.org/web/20201202134 ... solidworks
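
If anyone wants to check whether a given page already has a snapshot, the Wayback Machine offers an availability endpoint at archive.org/wayback/available. Here is a rough sketch, assuming the JSON response carries an `archived_snapshots` / `closest` entry the way the Internet Archive's public documentation describes. The helper names and `example.com` are placeholders, not anything taken from this thread.

```python
import json
import urllib.parse
from typing import Optional


def availability_url(page_url: str, timestamp: Optional[str] = None) -> str:
    """Build the query URL for the Wayback Machine availability endpoint."""
    params = {"url": page_url}
    if timestamp:
        # Optional YYYYMMDD-style hint for the snapshot date to look near.
        params["timestamp"] = timestamp
    return "https://archive.org/wayback/available?" + urllib.parse.urlencode(params)


def closest_snapshot(response_text: str) -> Optional[str]:
    """Return the archived URL from a response body, or None if there is none."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


# Placeholder page, not a real forum address.
query = availability_url("example.com", timestamp="20210325")
```

The text returned by requesting `query` (for example with `urllib.request.urlopen`) can be handed straight to `closest_snapshot`.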
-
I may not have gone where I intended to go, but I think I have ended up where I needed to be. -Douglas Adams
matt
Posts: 1536
Joined: Mon Mar 08, 2021 11:34 am
Answers: 18
Location: Virginia

Re: Index Scrape/Crawl of Solidworks Forum

jmongi wrote: Thu Mar 25, 2021 8:33 am Before the "transition", would it be feasible to scrape/index the current SolidWorks user forum to at least get it into a usable, searchable file that future, more ambitious programmers could work with?

I'm just assuming that it will be much easier to do (if it's possible at all) in its current incarnation than after the transition to the swamp. Just a thought for those with way more programming experience than I have.

I would think you could write a script to systematically navigate through the threads, pull the source (HTML?), copy it to a file, and repeat. On its own, the information might not be very usable in that format, but it would at least be available to be transformed in the future.
There are some software packages that do this. In the first week we were up, we had a minor scandal where someone actually started doing that and then posted the results here. The SW Forum's terms of use say that you cannot post its content publicly anywhere else, so that pretty much settles it.

But they can't control (or, more importantly, litigate) what you write if you give an account of the same content in your own words. Basically, don't copy/paste anything. You can summarize or elaborate, but please don't copy or scrape and then paste here. I want us to make it on our own merits rather than resort to copying content (and fending off legal jousting).
jcapriotti
Posts: 1792
Joined: Wed Mar 10, 2021 6:39 pm
Answers: 29
Location: The south

Re: Index Scrape/Crawl of Solidworks Forum

Yeah, I imagine the Dassault legal team is more competent than the 3dswym team. :o
Jason
jmongi
Posts: 101
Joined: Wed Mar 24, 2021 1:25 pm
Answers: 0

Re: Index Scrape/Crawl of Solidworks Forum

I didn't consider the legal aspects of such an activity. Good point.
Designated Pot-Stirrer
colt
Posts: 54
Joined: Tue Mar 30, 2021 5:43 pm
Answers: 0

Re: Index Scrape/Crawl of Solidworks Forum

@SPerman The Wayback Machine will work great for surface content, but I don't think it will keep the source files for post attachments like macros or models. Will those be completely lost, or are they going to be transferred to the new forum?