HNSSearch Demo w/ Andreas Renz

Transcript

00:00
[Music] This has been an intense day one. Last but not least, Andy. At least I think he says it’s okay that I call him Andy. I remember we started chatting on handshake mercenary did BBS, like a couple of years ago now. So, um,

01:04
it’s a pleasure to have you here and to see you continuing to build and contribute towards Handshake. And just like the title of the show, you know, freeing the internet, that’s what it’s about. So we’re a little bit past time, and I will pass the stage to you. Is it screen share, demo, PowerPoint? What do we got? I will screen share. I will show a presentation, just to be sure that everything runs smoothly. Awesome. Thank you for the intro, thank you for having me again. I will disappear.

01:51
All right, so thank you. Hi everyone, awesome to be here again for the third time. I hope I can still tell you something new, and for everyone who hasn’t seen or heard about HNSSearch, I hope you can learn something. So I will be talking about HNSSearch, the current state, and where I would like it to go in the future. So who are we? Last year when I presented this, we were two people working on HNSSearch. The other person jumped off, so right now it’s just me. So it’s a bit of

02:29
a one-man show, doing it on the side, but still having a lot of fun doing it. So let’s see what I’ve been up to in the last couple of months. I did take a break last year just to sort some things out, but now I’m back with full force. When we talk about search engines, I think it’s important to quickly understand the basic architecture of a search engine. It all starts with the crawling. Crawling is basically searching the internet, searching other websites for content based on certain rules, in our

03:08
case just rules that check whether a site is an HNS, a Handshake, site or not. Then the crawler itself, or a separate entity, will index all of the content it finds and put it in a database so it can be searched later. And at the end, of course, a search engine wouldn’t be of much value if you couldn’t search, so the search aspect is also important. At HNSSearch we have actually put the crawling and the indexing into one piece of software and the searching into another one. The idea here is that you can use each one separately.
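The split described here, one component that builds an index and a separate one that searches it, can be illustrated with a minimal sketch. This is not HNSSearch code: the function names, the in-memory inverted index, and the sample pages are all assumptions made up for the illustration.

```python
import re
from collections import defaultdict

# Hypothetical crawled pages (URL -> text); a real crawler would fetch these.
PAGES = {
    "http://blog.demo/": "Handshake names explained",
    "http://news.demo/": "Daily handshake news",
}


def tokenize(text: str) -> list[str]:
    """Lowercase a text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages: dict[str, str]) -> dict[str, set[str]]:
    """Indexer side: map every token to the set of URLs that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for url, content in pages.items():
        for token in tokenize(content):
            index[token].add(url)
    return dict(index)


def search(index: dict[str, set[str]], query: str) -> list[str]:
    """Search side: return the URLs that contain every query token, sorted."""
    tokens = tokenize(query)
    if not tokens:
        return []
    hits = set.intersection(*(index.get(token, set()) for token in tokens))
    return sorted(hits)


INDEX = build_index(PAGES)
print(search(INDEX, "handshake news"))  # ['http://news.demo/']
```

Because `search` only needs the index mapping, either half could be swapped for another implementation, which is the point of keeping them separate.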

03:51
So if you have your own index, you could use the search front end and hook it up to your index, or if you want to use the HNSSearch index, you can build your own front end and connect to the index. If we look at the basic architecture: at the bottom we have Linda, which is our indexer, I’ll be speaking about that in a bit; dWeb Pulse, which is basically the home of the index, where the public index will be accessible; and the search engine, which is the front end. The front end is the part which will be centralized. It will be centralized

04:37
because that’s just the hosted website, but everything further down in the pyramid will eventually be decentralized and publicly available. Linda, at the bottom of the pyramid, is so called because it stands for Living Index for Navigating Decentralized Assets. She is basically our crawler. She searches the web, the dWeb in that case, and provides us with all the necessary websites. Some technical details: she’s written in Python, she controls which sites are being crawled, how they are being indexed, which content is being scraped from the web,

05:23
and is also doing a machine learning analysis where she checks what type of content is on the website, is it a news website, is it a blog, and then stores that information as a category in the database. That will come in handy for dWeb Pulse, as you’ll see shortly. She is built for Handshake websites, that’s the main purpose, but can be extended to other protocols very easily. She is currently closed source but will be open sourced and made publicly available once she’s finished, but she still has a lot of work to do.
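To make the categorization step concrete, here is a deliberately simple stand-in. The talk does not describe Linda's actual machine learning model, so this sketch swaps in plain keyword counting; the category names and keyword sets are hypothetical.

```python
import re

# Assumption: hand-picked keywords per category, standing in for a trained model.
CATEGORY_KEYWORDS = {
    "news": {"news", "breaking", "headline", "report"},
    "blog": {"blog", "post", "author", "diary"},
}


def categorize(text: str, default: str = "uncategorized") -> str:
    """Return the category whose keywords occur most often in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    scores = {
        category: sum(word in keywords for word in words)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    # Fall back to the default label when no keyword matched at all.
    return best if scores[best] > 0 else default


print(categorize("Breaking news: daily report"))  # news
print(categorize("My first blog post"))           # blog
```

A crude classifier like this will mislabel pages, which is one reason a feedback channel for correcting categories is useful.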

06:06
Then the second part, dWeb Pulse. It is not yet fully implemented, but it is basically the open index that is provided by Linda with the categories. It gives every user the possibility to check, for a certain domain or website, what the category is and then provide feedback if that category is wrong; if it’s right, you don’t have to do anything. That will come in handy a bit later, you’ll see how these categories can be used. So dWeb Pulse itself is a public index of HNS websites. It provides the content-based categorization. Any user

06:51
can then make change requests, and based on those categories you will be able to create blocklist templates for your search. So even within HNSSearch you will be able to say, I don’t want to see this category, I don’t want to see this type of content, and those categories come from dWeb Pulse. That’s why it’s also important that everyone can give their feedback and request category changes, as of course the machine learning model will not be 100 percent perfect. And it will offer a great, yeah,

07:31
great API for other search engines, where you can directly hook up to the index and create your own search engine, so you’re not dependent on the HNSSearch front end itself. This one is still under development. I’m hoping to push it out for testing soon, so I’ll keep you posted once that’s live. Then the public index, which by the way just comes with dWeb Pulse, and at the top we have HNSSearch, the front end itself. We are now actually in alpha testing, and we have so far indexed about

08:19
125 unique TLDs. That amounts to 210 different sites, and in total we have indexed 621 pages. The crawler is running day and night, but it just has a bit of trouble finding all your Handshake sites. Therefore we’ll provide the missing site form soon, where you can let me know if a website is missing. I will go into detail a bit later on how the crawling actually works. To the alpha testers: I’ve received a lot of great feedback. Thank you to everyone who helped discover new bugs or requested new features. It’s been awesome

09:07
so far. We’ve actually amounted to 300 unique visitors who viewed 1200 pages together and came from 33 different countries. Now, when you see such statistics, I can imagine that you’re worried about tracking or user data being stored. I can assure you that we do not store any user data. Even your search results or your search terms are not tracked; I do not see those. I’m using Plausible, which is an open source analytics tool, and I just see those numbers. I’m not

09:51
tracking anything else, and like I said, I don’t see search terms. But I think it’s interesting to see how many views and how many visitors we have, just to see what’s going on. So thank you everyone for testing. It’s still open for testing, so you can go ahead and visit it at hnssearch.io.

10:19
The Handshake URL will also be available in time; I’m just now focused on other things to develop. But it is open. Keep in mind it is still in testing, so there might be bugs or performance issues, but if you notice something, just let me know. You can hit us up on Twitter at @hnssearch or in the Handshake Discord. Feedback is highly appreciated because that will help us grow and develop it further. On the front end UI side, you can see it’s a standard search UI. It comes in dark and light mode based on the preferences of

11:05
the operating system. You can search a term and it will give you back the desired results. We are using, by the way, Meilisearch in the back, which is a super cool search engine that is typo-resistant, so even if you would, like here, type handshake without an a, it would still find you good results. And Meilisearch is really fast, so it’s really awesome to integrate. I just dropped a new feature yesterday, which I like to call info boxes, where if you search for something that has a page on Handypedia, it will show up as a small info box on

11:54
the right side. This is only on desktop and tablet, not on mobile, because I don’t want to crowd the mobile experience too much. So if you have a project, please add a page to Handypedia. I think it’s really a great resource, so please consider adding a page, or hit me up and I can help you with creating one, because that will also increase our index for the info boxes. One other feature which we have in HNSSearch is the shortcuts, which

12:44
let you go directly to another website by using these shortcuts. I will quickly show you a demonstration and then you can see what shortcuts are available; hopefully you get a better sense of it. As you can see, you enter the shortcut and then you enter your search term, you search, and it automatically redirects you to your desired site. If we have a look, those are the shortcuts which are currently available. If you have the desire for more or another shortcut, just let me know. This can be implemented easily and I’ll be happy to integrate your

13:29
favorite Handshake-related site, or we can see what else you have in mind. So what is currently being developed on HNSSearch? The next few features that are going to be coming out: first, a website list. It was requested that the whole index be browsable without having to search, so I’m working on that, just so you can see what has been built. I think one of the big challenges now is not necessarily finding what you’re looking for, but not knowing what you’re looking for, because we are

14:11
still in the early days and there are not that many websites. So that website list is coming soon. I’ll be adding infinite scrolling on mobile so you don’t have to use pagination, the missing site form, as mentioned before, where you can let me know if there are websites missing, and the index will be publicly released together with the testing website of dWeb Pulse. Maybe a little bit more about the missing sites. Linda acts like a standard crawler in the sense that she goes to a starting page and

14:56
then she finds all the URLs on that page, and from there she spreads and finds further pages until she can’t find any more pages. So if you have a website on Handshake that doesn’t link to others, or that no other site links to, Linda will have a hard time finding it. Therefore we’ll provide this missing site form. If you don’t see your site, just let me know; I can add it to the starting list and it will be crawled within a day. But just so you understand why sometimes sites don’t appear
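The crawl described here, start from a list of known pages and follow links until nothing new turns up, is a breadth-first traversal. The sketch below is an illustration under stated assumptions, not Linda's code: the link graph is a hard-coded dictionary instead of real HTTP fetches, and the Handshake check (treat any TLD outside a small sample of ICANN TLDs as a Handshake name) is a guess at what such a rule could look like.

```python
from collections import deque
from urllib.parse import urlsplit

# Assumption: a tiny sample of ICANN TLDs; a real check would use the full root zone list.
ICANN_TLDS = {"com", "org", "net", "io"}

# Hypothetical link graph standing in for fetched pages.
LINKS = {
    "http://start.demo/": ["http://a.demo/", "https://example.com/"],
    "http://a.demo/": ["http://b.demo/", "http://start.demo/"],
    "http://b.demo/": [],
    "http://orphan.demo/": [],  # no page links here, so the crawl cannot reach it
}


def is_handshake_host(host: str) -> bool:
    """Heuristic: a host counts as Handshake if its TLD is not a known ICANN TLD."""
    tld = host.rstrip(".").rsplit(".", 1)[-1].lower()
    return bool(tld) and tld not in ICANN_TLDS


def crawl(seeds, get_links):
    """Breadth-first crawl from the seed URLs, following only Handshake links."""
    seen = set(seeds)
    queue = deque(seeds)
    visited = []
    while queue:
        url = queue.popleft()
        visited.append(url)
        for link in get_links(url):
            host = urlsplit(link).hostname or ""
            if link not in seen and is_handshake_host(host):
                seen.add(link)
                queue.append(link)
    return visited


def demo_links(url):
    """Look up outgoing links in the hard-coded graph."""
    return LINKS.get(url, [])


print(crawl(["http://start.demo/"], demo_links))
# ['http://start.demo/', 'http://a.demo/', 'http://b.demo/']
```

The orphan page never appears in the output because nothing links to it. Adding it to the seed list, which is what a missing site form would effectively do, makes it reachable.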

15:35
or why you cannot find them. So a bit more about the vision, now that we have talked about the state of HNSSearch. That’s what’s currently available: the front end is already open source. You can find it on the GitHub page; you can clone it, you can use it, you can adapt it, and the rest will be open sourced as soon as it’s ready. So for the vision: the vision is a free and open source crawler and indexer which has an open and distributed index, distributed in the sense that it’s not only stored on one server but distributed across

16:19
many nodes, many servers, so that it cannot be taken down. It should be accessible at all times and not dependent on just HNSSearch. The vision is also that once this index is decentralized and publicly available, other people will build their own front ends and integrate the index or extend their existing indexes, and that you as a user get full control. Full control in the sense that you decide what you want to see and what you don’t want to see. We won’t block your search results, so you really have full control.
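The user-controlled filtering mentioned here, together with the category-based blocklists described earlier, could look roughly like the following. This is a sketch under assumptions: the `Result` record, its fields, and the category names are hypothetical, and the talk does not specify how HNSSearch will actually apply blocklists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """A search hit with the category assigned by the index (hypothetical shape)."""
    url: str
    category: str


def apply_blocklist(results, blocked_categories):
    """Drop results whose category the user chose to block; keep the original order."""
    blocked = {category.lower() for category in blocked_categories}
    return [result for result in results if result.category.lower() not in blocked]


RESULTS = [
    Result("http://blog.demo/", "blog"),
    Result("http://news.demo/", "news"),
    Result("http://shop.demo/", "shop"),
]

print([r.url for r in apply_blocklist(RESULTS, {"News"})])
# ['http://blog.demo/', 'http://shop.demo/']
```

The filter runs on the user's own choices, so an empty blocklist leaves the results untouched.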

17:02
The search results are also provided based on content, so there is no, excuse me, elaborate ranking mechanism; it’s really what suits your search term the most. Then, as already mentioned, no data logging or tracking. We do not wish to monitor everything you do on HNSSearch, but to let you enjoy the privacy of your own searches. And, as mentioned, a network of nodes where the index is so distributed it cannot be taken down. So what is the concrete vision for HNSSearch itself? HNSSearch will be the

17:51
standalone website which acts as a front end. It is a convenience thing where you again give up a bit of the decentralization, as we heard already in earlier talks. It is centralized, but we try to make it as decentralized as possible. In the end, if you use HNSSearch itself, you are dependent on its uptime, but that’s why the index itself should and will be decentralized, so you will always have access to it. We will integrate the blocklists, where you can choose yourself, based on category circles,

18:31
what you want to see and what you don’t want to see. The idea here is that HNSSearch will build a circle, but renowned projects around Handshake can also build their own circle, and you can just use their circle. That will be a config URL which you can paste into HNSSearch, and we will apply it. So if you wish to use other, maybe stricter, filters, other companies or projects can provide their own circle and your search results will be filtered accordingly. It will be fully open source, so anyone can use the code, take the code, modify

19:10
the code, and of course also contribute if they see fit. The index is swappable, or will be swappable, which means that you can plug another index into the front end. So if you have an index online but you don’t want to use or build your own front end, you will be able to just attach your index in the settings and search it through the front end. We will be providing a custom toolkit so it’s super easy to create your own search engines. I think search should be spread across and

19:53
you should have full control, so we want to make it as easy as possible for you. And then the end vision, or the ultimate vision, is a desktop client, where we really will have a fully decentralized search engine with index sharing. So you will have the control. The client will be on your desktop. As soon as you boot it up, you will connect to the decentralized index, you will be hosting the index yourself, it will sync with everything that’s happened, and you will have the opportunity to launch your own crawler, which then

20:34
pumps sites back into the index. So it will really be an index maintained by the community, and here too you have control over whether you want to provide information to the index or not. You can also just crawl for yourself and keep your private index, which you will then have to host yourself. With the crawler, I’m still working on the incentive structure, where you actually get paid or get some incentives for running a crawler, because it would be awesome if it didn’t just depend on

21:15
one or two crawlers, so we can really keep extending everything Handshake or also other dWeb protocols. As I mentioned, Linda is very easily extendable to other protocols, even to the standard web; it’s just some settings that need to be modified. I think that’s it for the quick overview of the state and of what’s coming. I’ll be working on it again at full pressure. I really love working on it. It was good that I took this break, but I’m glad I’m back to it, and the feedback so far has been amazing. So

22:03
thank you to everyone who has been a tester, and thank you to everyone who will be a tester. Again, you can reach it at hnssearch.io. You can hit us up on Twitter at @hnssearch. If you have any questions or anything, just ask us, or ask me. I’ll be happy to help you, to talk to you, or to answer any questions. And I see we already have questions here, so I think I’ll go through them. The first one I can see: has HNSSearch established any preferences for SEO in our site pages that would facilitate its function?

22:49
We do crawl robots.txt files and sitemaps. A sitemap is basically a tree of your site structure, and that will of course help discover all your pages. And robots.txt is basically a standard protocol which tells a crawler what to crawl and what not to crawl. Linda respects that, which means you can put pages you specifically don’t want to be crawled in there and we will leave them be. Other than that, it’s really just how well your content matches the search term. So what doesn’t work, for

23:35
example, is keyword spamming. If you put, for example, the word fun 50 times in your content, it doesn’t mean that it will rank higher if you search for fun. So it’s really how well the content matches the search term. But other than that, yeah, just create the robots.txt file and the sitemap.

24:06
Sitemaps would be great, because it’s really difficult to find your pages if you don’t link to them. Great, so let’s see. I see now the questions are on screen. I think that one should be answered, I hope. If we expand the site’s links, how long before it would be crawled again, or could it be submitted to the missing site form? If you expand it: right now I’m still figuring out what the update rate of pages should be. Currently it’s one week, but if you add it to the missing sites it’s one day, once the

24:45
missing site form is live, because the crawler does multiple cycles per day. So far the refresh rate is set to one week. It will be adjusted based on the update frequency of a website. Linda is now building a model where she checks how often sites are updated and will update the refresh rate accordingly. It will always be a maximum of a week, but, for example, for websites that are known to change every day, the refresh rate will go down. Good, let’s see if there are any
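One way to picture an adaptive refresh schedule like the one described: cap the interval at one week and shorten it for sites that change often. The sketch below is an assumption, not Linda's model; it simply averages the observed gaps between content changes, and the one-day floor is a made-up lower bound.

```python
from datetime import timedelta

MAX_INTERVAL = timedelta(days=7)  # from the talk: never longer than a week
MIN_INTERVAL = timedelta(days=1)  # assumption: do not recrawl more often than daily


def refresh_interval(observed_change_gaps):
    """Pick a recrawl interval from the gaps between detected content changes."""
    if not observed_change_gaps:
        # No change history yet: fall back to the weekly default.
        return MAX_INTERVAL
    average = sum(observed_change_gaps, timedelta()) / len(observed_change_gaps)
    # Clamp the average gap into the allowed range.
    return max(MIN_INTERVAL, min(MAX_INTERVAL, average))


print(refresh_interval([timedelta(days=2), timedelta(days=4)]))  # 3 days, 0:00:00
```

A site that changes daily lands on the one-day floor, while a site that has not changed in a month stays at the weekly maximum.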

25:36
So for feature requests, currently you can either write them on the Handshake Discord, there is a project page for HNSSearch, you can send them on Twitter, or you can open an issue on GitHub if that’s more your style. I will be adding a feature board where you can add features and upvote them, but that’s so far in the works. Any requests are highly appreciated, and we’ve been known to implement them rather quickly if they’re easily doable. Otherwise they will for sure be put on

26:12
the list and will be implemented later. But yeah, those are the channels where you can find us. Currently I’m not actively looking for investments. We’re always happy for any help or any donations or any grants that we, or I, can receive, since this is a side job, I would call it, and we do not have any active funding. But there are actually many ways you can contribute to HNSSearch, one of which is testing the site. Another one is actually building Handshake websites. Then also, as I mentioned,

26:55
providing Handypedia pages, which can then be crawled, or reviewing our code, or just letting us know what you would like to see. Donations are appreciated, of course, since this is a side project, an unpaid job which I do on the side of my main job, but they’re not necessary. I mean, it’s already super nice if people give feedback and test the site. And on HNSSearch, if you go at the top to the about section, in the documentation there is a how to contribute page, which we just launched yesterday, where you can

27:38
see all the ways you can help and contribute. Next: I’m curious about the difference between DNS-based sites, traditional sites, HNS sites, and IPFS sites for a search engine. So in the site itself there is not much difference. HNS sites just tend to be simpler, like the sites from the early internet days, but tech-wise there is not much difference. IPFS sites do tend to behave a bit differently when you call them by URL, but in terms of technical things there is really not much difference besides the

28:20
fact of how you reach them. I think that’s the tricky part when creating a crawler: how do you actually tell the frameworks that exist to go to an HNS site, and how do we discover if something is an HNS site or if it’s an ICANN site? But the technical things and how you build the websites are basically the same, depending of course on what you want to integrate. But yeah, I think we covered all the questions. So thank you very much for all your questions and for your interest. If you have more questions, just

29:03
hit us up on Twitter. You can find us here now at HandyCon; I’ll be around, just hanging out in the chat. Thank you everyone for your feedback, for testing HNSSearch, and for all the love you send our way. It’s really a big motivator to keep on doing it, and I really hope we can make the vision come true and provide everyone with a fully decentralized search engine. I think that’s it, and I think we even stayed within the time slot. Which hat? I got my OG legit hat, let me put that one on. Nice.

29:52
It’s just, you know, the flamingo sticks up, so normal hats don’t fit. That’s why I have this other one with the flip-up. Anyway, great. Oh, the Kirk’s here, see, that’s legit. Anyways, thanks Andy. I really appreciate what you’re doing. It’s like a side hustle, building a search engine for the decentralized internet. That’s pretty crazy when you think about it. It’s like, oh, on the side I’m building a search engine for the decentralized web. Yeah, it’s a wild ride, but it’s

30:24
awesome, it’s fun. That’s great. I mean, we really commend you. I know a lot of others are doing this too, even me. I’m an Amazon seller and e-commerce, you know, marketing company on my TV by day, and at night I wear flamingo glasses, and my daughter doesn’t know what I really do and tells her teachers crazy stories about me. But anyway, I really appreciate it. There were some other questions in chat, I think from encrypted. It wasn’t in the Q&A,

30:54
but yeah, I think the main point is you just seem to want more feedback and more requests, basically. It seems like your main call to action is people giving you more suggestions. I don’t know if you answered it, but somebody asks: where can I put feature requests or feedback? Is there like a wiki or a board, something like this? Like I said, on Discord or on Twitter so far, or on GitHub in the repo of the front end, but there will be a feature board released soon, because I think that’s, okay.

31:29
All right, well, I know it’s late for you, late for everybody, but I think we’re basically on time. So unless you have any other points, I really appreciate it. I think everyone’s here thanking you. And just to thank you also for setting up HandyCon; it’s awesome, for the third time in a row, and I’m really grateful to be on again. Yeah, no, we all do what we can. I’m not a developer, so I do this, and you’re a developer and you’re doing that.

32:02
So that’s what it’s about, right? We do what we’re good at and just keep pushing our limits and contributing. I get a little bit emotional with these things. We just got to keep pushing our boundaries as a community. I’m seeing some threads already; Raul left a long thread on Twitter I was reading. But yeah, let’s wrap up day one. Here we go, another day exactly like this tomorrow, with jam-packed sessions.

32:33
Tomorrow’s gonna be more about SLDs. Today was kind of protocol level and tech; tomorrow’s SLD discussions. But I think I have a closing session. Do I have to do that, Penny? All right, well, let’s wrap this one up. I guess I’ll just say some closing words for those who want to stick around, and then I guess we got leaderboard awards. So I think we’ll end the HNSSearch session with Andy. Thank you again, and if you want to stick around for five, ten more minutes,

32:59
we’ll do a closing. All right, yeah, I’ll do the closing in a second. Thanks, thanks Andy. [Music]